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Abstract 

A systematic assessment of global neural network connectivity through direct electrophysiological assays 
has remained technically infeasible, even in dissociated neuronal cultures. We introduce an improved 
algorithmic approach based on Transfer Entropy to reconstruct approximations to structural connectivity 
from network activity monitored through calcium imaging. We focus in this study, as a first step, on 
the inference of excitatory synaptic links. Based on information theory, our method requires no prior 
assumptions on the statistics of neuronal firing and neuronal connections. The performance of our 
algorithm is benchmarked on surrogate time series of calcium fluorescence generated by the simulated 
dynamics of a network with known ground truth topology. We find that the functional network topology 
revealed by Transfer Entropy depends qualitatively on the time-dependent dynamic state of the network 
(bursting or non-bursting). Thus by conditioning with respect to the global mean activity, we improve 
the performance of our method. This allows us to focus the analysis to specific dynamical regimes of the 
network in which the inferred functional connectivity is shaped by the underlying excitatory connections, 
rather than by collective synchrony. Our method can discriminate between actual causal influences 
between neurons and spurious non-causal correlations due to light scattering artifacts, which inherently 
affect the quality of fluorescence imaging. Compared to other reconstruction strategies such as cross- 
correlation or Granger Causality methods, our method based on improved Transfer Entropy is remarkably 
more accurate. In particular, it provides a good reconstruction of the excitatory network clustering 
coefficient, allowing for discrimination between weakly and strongly clustered topologies. Finally, we 
present the applicability of our method to real recordings of in vitro cortical cultures. We demonstrate 
that excitatory connections in these networks are characterized by an elevated level of clustering compared 
to a random graph (although not extreme) and can be markedly non-local. 

Author Summary 

Unraveling general organization principles of connectivity in neural circuits is a crucial step towards un- 
derstanding brain function. However, even the simpler task of assessing the global excitatory connectivity 
of a culture in vitro, where neurons form self-organized networks in absence of external stimuli, remains 
challenging. Neuronal cultures undergo spontaneous switching between episodes of synchronous bursting 
and quieter inter-burst periods. We introduce here a novel algorithm which aims at inferring the connec- 
tivity of neuronal cultures from calcium fluorescence recordings of their network dynamics. To achieve 
this goal, we develop a suitable generalization of Transfer Entropy, an information-theoretic measure of 
causal influences between time series. Unlike previous algorithmic approaches to reconstruction. Trans- 
fer Entropy is data-driven and does not rely on specific assumptions about neuronal firing statistics or 
network topology. We generate simulated calcium signals from networks with controlled ground-truth 
topology and purely excitatory interactions and show that, by restricting the analysis to inter-bursts 
periods. Transfer Entropy robustly achieves a good reconstruction performance for disparate network 
connectivities. We apply, finally, our method to real data and find evidence of non-random features in 
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cultured networks, such as the existence of high-connectivity hub excitatory neurons and of an elevated 
(but not extreme) level of clustering. 



Introduction 



The identification of the topological features of neuronal circuits is an essential step towards the under- 
standing of neuronal computation and function. Despite considerable progress in neuroanatomy, electro- 
physiology and imaging 1-8 , the detailed mapping of neuronal circuits is already a difficult task for a 
small population of neurons, and becomes unpractical when accessing large neuronal ensembles. Even 
in the case of cultures of dissociated neurons, in which neuronal connections develop de novo during the 
formation and maturation of the network, very few details are known about the statistical features of 
this connectivity, which might reflect signatures of self-organized critical activity [£- 
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Neuronal cultures have emerged in the last years as simple, yet versatile model systems 
the quest for uncovering neuronal connectivity 
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The fact that relatively 



appealing for studying the interplay between activity and connectivity. 

The activity of hundreds to thousands of cells in in vitro cultured neuronal networks can be simul- 
taneously monitored using calcium fluorescence imaging techniques 14 21 22 . Calcium imaging can be 
applied both in vitro and in vivo and can be combined with stimulation techniques like optogenetics [23] . 
A major drawback of this technique, however, is that the typical frame rate during acquisition is slower 
than the cell's firing dynamics by an order of magnitude. Furthermore the poor signal-to-noise ratio is 
such to make hard the detection of elementary firing events. 

Neuronal cultures are unique platforms to investigate and quantify the accuracy of network reconstruc- 
tion from activity data, extending analysis tools initially devised for the characterization of macro-scale 
functional networks 24,25 to the micro-scale of a developing local circuit. 

Here we report a new technique to reconstruct the connectivity of a neuronal network from calcium 
imaging data, which is based on information theory. We use an extension of Transfer Entropy (TE) [26 28 



to extract a directed functional connectivity network in which the presence of a directed edge between 
two nodes reflects a direct causal influence by the source to the target node 29-31 . Note that "causal 
influence" is defined operationally as "improved predictability" [32[|33' reflecting the fact that knowledge 



of the activity of one node (putatively pre-synaptic) is helpful in predicting the future behavior of another 
node (putatively post-synaptic) . TE has previously been used to study gene regulatory networks 34 , the 
flow of information between auditory neurons 35 , to infer directed interactions between brain areas based 
on EEG recordings 36 or between different LFP frequency bands [37^, as well as for the reconstruction 
of the connectivity based on spike times 38 39|. 



Importantly, our data-driven approach is model-independent. This contrasts with previous approaches 
to network reconstruction, which were most often based on the knowledge of precise spike times 40 -45], 
or explicitly assumed a specific model of neuronal activity [43 44 . Our method is thus not constrained 
to linear interactions between nodes. This absence of a parametric model can be advantageous not only 
conceptually, where we hope to make the least amount of assumptions necessary, but also practical for 
applications to real data, because the used statistical assumptions might be too restrictive or inappropri- 
ate. 

A problem inherent to the indirect algorithmic inference of network connectivity from real data is 
that the true target topology of the network is not known and that, therefore, it is difficult to assess the 
quality of the reconstruction. In order to characterize the behavior of our algorithm and to benchmark 
its potential performance, we resort therefore to synthetic calcium fluorescence time series generated by 
a simulated cultured neural network that exhibits realistic dynamics. Since the "ground truth" topology 
of cultures in silico is known and arbitrarily selectable, the quality of our reconstruction can be evaluated 
by systematically comparing the inferred with the real network connectivities. 
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We use a simplified network simulation to generate surrogate imaging data, improving their realism by 
the reproduction of light scattering artifacts 6 which ordinarily affect the quality of the recording. Our 
surrogate data reproduce as well another general feature of the activity of neuronal cultures, namely the 
occurrence of temporally irregular switching between states of asynchronous activity, with a relatively 
weak average firing rates, and states of highly synchronous activity, commonly denoted as "network 
bursts" 20^,46.47. 

This switching dynamics poses potentially a major obstacle to reconstruction. As a matter of fact, 
during bursting phases, collective synchronization results in an effective network with densely connected 
components (reflecting communities of highly synchronized neurons). Conversely — and usefully for 
applications — the directed functional connectivity during inter-bursts phases bears a strong resemblance 
to the underlying structural (i.e. synaptic) connectivity, because it reflects dominantly mono-synaptic 
interactions. In order not to average over dynamical regimes leading to very different functional networks, 
we restrict our analysis to the inter-burst dynamical regime only, ignoring bursting transients. This can be 
achieved very simply by conditioning our analysis to activity intervals in which the averaged fluorescence 
level is below a certain threshold, as an indirect but reliable indicator of whether the network is in a 
comparatively "quiet" phase. 

Appropriate conditioning — combined with a simple correction coping with the poor time-resolution 
of imaging data — allows to achieve a good topology reconstruction performance (assessed from synthetic 
data), without the need to infer exact spike times through sophisticated techniques (as is required, on 
the contrary, in [43|[45] ). As a matter of fact, TE out-performs three other standard approaches, namely 
cross-correlation (XC), Granger Causality (GC) and Mutual Information (MI) — all of which have been 
used to study the connectivity in neural networks [4Q||48 53 — , particularly in presence of light scattering. 



Finally, we apply our algorithm — optimized through model-based validation — to the analysis of real 
calcium imaging recordings. For this purpose, we study spontaneously developing networks of dissociated 
cortical neurons in vitro and we address, as a first step toward a full topology reconstruction, the simpler 
problem of extracting their excitatory connectivity only. Early mature cultures display a bursting dy- 
namics very similar to our simulated networks, with which they also share an analogous state-dependency 
of directed functional connectivity. 

Our generalized TE approach identifies thus network topologies with characteristically non-local con- 
nections, consistent with the existence of long axons able to span a large fraction of the culture T3p5l[54 



The retrieved degree distribution is broadened and characteristically right-skewed (but not "scale free", 
in contrast with [42]). Furthermore, our analysis hints at a level of clustering in spontaneously-developing 
cultures which is moderate but significantly larger than expected for random networks sharing the same 
degree distribution. 



Results 

The Results section is organized as follows. After a brief presentation of the qualitative similarity between 
real calcium fluorescence data from neuronal cultures and simulated data, we introduce numerical simula- 
tions showing that networks with very different clustering levels can lead to matching bursting dynamics. 
We develop then our reconstruction strategy, based on a novel generalization of TE, and examine the 
different elements composing our strategy, namely ''same-bin interactions" and conditioning with respect 
to the average fluorescence level. We show that only signals recorded during inter-burst periods convey 
elevated information about the underlying structural topology. After a discussion of criteria guiding the 
choice of the number of links to include in the reconstructed network, we illustrate two specific examples 
of reconstruction, and analyze the factors affecting its quality. The performance achieved by generalized 
TE is systematically contrasted with other standard linear and nonlinear competitor methods, all of them 
modified to include the same enhancements applied to the original TE formulation. Finally, we apply our 
reconstruction algorithm to biological recordings and extract the topological features of actual neuronal 
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cultures. 



Real and surrogate calcium fluorescence data 

In this study, we consider recordings from in vitro cultures of dissociated cortical neurons (see Materials 
and Methods). To illustrate the quality of our recordings, in Fig. [T|^ we provide a bright field image 
of a region of a culture together with its associated calcium fluorescence. As previously anticipated, to 
simplify the network reconstruction problem, experiments are carried out with blocked inhibitory GABA- 
ergic transmission, so that the network activity is driven solely by excitatory connections. We record 
activity of early mature cultures at day in vitro (DIV) 9-12. Such young but sufficiently mature cultures 
display rich bursting events, combined with sparse irregular firing activity during inter-burst periods (cfr. 
Discussion). 

In Fig. ^ (left panel) we show actual recordings of the fluorescence traces associated to five different 
neurons. The corresponding population average for the same time window is shown in Fig. [TJC (left 
panel). In these recordings, a stable baseline is broken by intermittent activity peaks that correspond to 
synchronized network bursts recruiting many neurons. The bursts display a fast rise of fluorescence at 
their onset followed by a slow decay. In addition, during inter-burst periods, smaller modulations above 
the baseline are sometimes visible, despite the poor time-resolution achievable of a frame every few tens 
of a ms. 

In order to benchmark and optimize different reconstruction methods, we also generate surrogate 
calcium fluorescence data (shown in the right panels of Figs.jl^ andjlp), based on the activity of simulated 
networks whose ground truth topology is known. We simulate the spontaneous spiking dynamics of 
networks formed by N = 100 excitatory integrate- and- fire neurons, along a duration of 60 minutes of real 
time, matching typical lengths of actual recordings. Calcium fluorescence time series are then produced 
based on this spiking dynamics, resorting to a model introduced in 43 and described in the Materials 
and Methods section. 

Although in our experiments a number of over 1000 cells is accessible, we observed in the simulations 
that = 100 neurons suffice to reproduce the same dynamical behavior observed for larger network sizes, 
while allowing still for an exhaustive exploration of the entire algorithmic parameter space. Furthermore, 
despite their reduced density, we maintain in simulated cultures the same average probability of connection 
than in actual cultures, where this probability (of p = 0.12, see Materials and Methods) is roughly 



estimated based on independent studies 15 , 55 



The fluorescence signal of a particular simulation run or experiment can be conveniently studied in 
terms of the distribution of fluorescence amplitudes. As shown in Fig. [ijl) for both simulations and ex- 
periments, the amplitude distributions display a characteristic right-skewed shape that emerge from the 
switching between two distinct dynamical regimes, namely the presence or absence of bursts. The distri- 
bution in the low fluorescence region assumes a Gaussian-like shape, corresponding to noise-dominated 
baseline activity, while the high fluorescence region displays a long tail with a cut-off at the level of 
calcium fluorescence of the highest network spikes. As we will show later, qualitative similarity between 
the shapes of the simulated and experimental fluorescence distributions will play an important guiding 
role for an appropriate network reconstruction. 

Note that we do not attempt to reproduce quantitatively the experimental distribution of fluorescence 
through our simplified model. We remark, however, that the right tail of our experimental and simulated 
distributions of fluorescence values cannot be well modeled by a power law. This contrasts with reports 
(56][57] of power-laws in the distribution of burst sizes, even if the comparison might be misleading, due to 



the indirect relation — although linear, at least in a certain range |58| — between fluorescence and burst 
size. 
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Different network topologies lead to equivalent network bursting 

Neurons grown in vitro develop on a bi-dimensional substrate and, hence, both connectivity and clustering 
may be strongly sensitive to the physical distance between neurons. At the same time, due to long 
axonal projections [13^, excitatory synaptic connections might be formed at any distance within the 
whole culture and both activity and signaling-dependent mechanisms might shape non-trivially long-range 
connectivity | 59[[6Q] . 

To test the reconstruction performance of our algorithm, we consider two general families of network 
topologies that cover a wide range of clustering coefficients. In a first one, clustering occurs between 
randomly positioned nodes {non-local clustering). In a second one, the connection probability between 
two nodes decays with their Euclidean distance according to a Gaussian distribution and, therefore, 
connected nodes are also likely to be spatially close. In particular, in this latter case, the overall level of 
clustering is determined by how fast the connection probability decays with distance {local clustering). 
Cortical slice studies revealed the existence of both local |61,62 and non-local 7,63 types of clustering. 

We will later benchmark reconstruction performance for both kinds of topologies and for a wide range 
of clustering levels, because very similar patterns of neuronal activity can be generated by very different 
networks, as we now show. 

Figure [2] illustrates the dynamic behavior of three networks (in this case from the non-local clustering 
ensemble). The networks are designed to have different clustering coefficients but the same total number 
of links (see the insets of Fig. [2^ for an illustration). The synaptic coupling between neurons was adjusted 
in each network using an automated procedure to obtain bursting activities with comparable bursting 
rates (see Methods for details and Table [l] for the actual values of the synaptic weight). As a net effect of 
this procedure, the synaptic coupling between neurons is slightly reduced for larger clustering coefficients. 
The simulated spiking dynamics is shown in the raster plots of Fig. [2]A.. These three networks display 
indeed very similar bursting dynamics, not only in terms of the mean bursting rate, but also in terms of 
the entire inter-burst interval (IBIs) distribution, shown in Fig.|2^. In the same manner, we constructed 
and simulated local networks — with a small length scale corresponding to high clustering coefficients 
and vice versa — and obtained the same result, i.e. very similar dynamics for very different decay lengths 
(not shown). 

We stress that our procedure for the automatic generation of networks with similar bursting dynamics 
was not guaranteed to converge for such a wide range of clustering coefficients. Thus, the illustrative 
simulations of Fig. [2] provide genuine evidence that the relation between network dynamics and network 
structural clustering is not trivially "one-to-one" , despite the fact that more clustered networks have been 
shown to have different cascading dynamics at the onset of a burst 1 42j . 

Extraction of directed functional connectivity 

We focus, first, on the reconstruction of simulated networks, taken from the local and non-local ensembles 
described above. We compute their directed functional connectivity based on simulated calcium signals. 

Discrete differentiation 

Synthetic fluorescence time series are pre-processed only by simple discrete differentiation, such as to 
extract baseline modulations associated to potential firing. These differentiated signals are then used as 
input to any further analyses. 

Generalized TE 

We resort to a modified version of TE that includes two novel features (described in detail in the Materials 
and Methods section), namely the treatment of "same bin interactions" and the ad hoc selection of 
dynamical states. 
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The original formulation of TE was designed to detect the causal influence of events in the past with 
events at a later time. Practically, since calcium fluorescence is sampled at discrete times, standard 
TE evaluates the influence of events occurring in the time-bin t of events occurring in earlier time bins 
t — 1, t — 2, .... By including same bin interactions in TE estimation, we consider also potential causal 
interactions between events that occur within the same time-bin t. This is important when dealing with 
experimental data of real neuronal cultures since the image acquisition rate is not sufficiently high to 
establish the temporal order of elementary spiking events. 

On the other hand, the selection of dynamical states is crucial to properly capture interactions be- 
tween neurons which lead to different activity correlation patterns in different dynamical regimes. Both 
simulated and real neuronal cultures show indeed a dynamical switching between two distinct states 
(bursting and non-bursting) that can be separated and characterized by monitoring the average fluores- 
cence amplitude and restricting the analysis only to recording sections in which this average fluorescence 
falls in a predetermined range. Selection of dynamical states is discussed in the next section. 

From TE evaluation to directed functional networks 

Once TE functional connectivity strengths have been calculated for every possible directed pair of nodes, a 
reconstructed network topology can be obtained by applying a threshold to the TE values at an arbitrary 
level. Only links whose TE value is above this threshold are retained in the reconstructed network 
topology. 

Choosing a threshold is equivalent to choosing an average degree 

As a matter of fact, selecting a threshold for the inclusion of links corresponds to set the average degrees 
of the reconstructed network. As intuitive and as shown in Figure SI A, a linear correlation exists between 
the number of links and the average degree. Because of this relation, an expectation about the probability 
of connection in the culture, and hence, its average degree, can directly be translated into a threshold 
number of links to include. 

Based on the aforementioned estimations of probability of connection and taken into account the 
different sizes of our (smaller) simulated network and of our (larger) experimental cultures, threshold 
values are roughly selected to include the top 10% of links^ for reconstructions of simulated networks, 
and to include the top 5% of links^ for reconstructions from actual biological recordings. These choices 
are such to lead, in both cases, to reconstructed networks with comparable probability of connection, as 
previously mentioned. 

The (limited) impact of a "wrong" threshold selection on the inference of specific topologic features, 
like the clustering coefficient, will be discussed in later sections. 

Extracted functional networks vs ground-truth structural networks 

For the simulated data, the resulting connectivity matrix can be directly compared to the ground truth 
topology, and a standard Receiver- Operator Characteristic (ROC) analysis can be used to quantify the 
quality of reconstruction. ROC curves are generated by gradually moving a threshold level from the lowest 
to the highest TE value, and by plotting at each point the fraction of true positives as a function of the 
fraction of false positives. Examples of ROC curves are shown in Figs. [3]-[7| for different reconstruction 
conditions and network ensembles and are commented in the corresponding sections. 

Alternative functional connectivity metrics 

We compare directed functional connectivity based on TE with alternative networks based on (adapted) 
crosscorrelation (XC), Granger Causality (GC) or Mutual Information (MI) metrics. Detailed definitions 
of them are provided in the Materials and Methods section. The only difference with respect to the 
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procedure just described is evaluating per each directed edge functional coupling scores based on XC, 
GC or MI, rather than on TE. 

Network reconstruction depends on the dynamical states 

Immediately prior to the onset of a burst the network is very excitable. In such a situation it is intuitive 
to consider that the directed functional connectivity can depart radically from the structural excitatory 
connectivity, because local events can potentially induce changes at very long ranges, due to collective 
synchronization, rather than to direct synaptic coupling. Conversely, in the relatively quiet inter-burst 
phases, a post-synaptic spike is likely to be influenced solely by the presynaptic firing history. Hence, 
the directed functional connectivity between neurons is intrinsically state dependent (cfr. also 64 ), a 
property that must be taken into account when reconstructing the connectivity. 

We illustrate here the state dependency of directed functional connectivity by generating a random 
network from the local clustering ensemble and by simulating its dynamics, including light scattering 
artifacts to obtain more realistic fluorescence signals. The resulting distribution of fluorescence amplitudes 
is divided into seven non-overlapping ranges of equal width, each of them identified with a Roman numeral 
(Fig. [sjA.). Finally, TE is computed separately for each of these ranges, based on different corresponding 
subsets of data from the simulated recordings. 

The quality of the reconstruction is summarized by the performance levels which, following an arbitrary 
convention, is measured as the fraction of true positives at 10% of false positives read out of a complete 
ROC curve. 

We plot the performance level as a function of the average fluorescence amplitude in each interval, 
as shown in the blue line of Fig. [3^. The highest accuracy is achieved in the lowest fluorescence range, 
denoted by I, and reaches a remarkably elevated value of approximately 70% of true positives. The 
performance in the higher ranges II to IV decreases to a value around 45%, to abruptly drop at range V and 
above to a final plateau that corresponds to the 10% performance of a random reconstruction (ranges VI 
and VII). 

Note that fluorescence values are not distributed homogeneously across ranges I- VII, as evidenced 
by the overall shape of the fluorescence distribution in Fig. |3|^. For example, the lowest and highest 
ranges (I and VII), differ by two orders of magnitude in the number of data points. To discriminate 
unequal-sampling effects from actual state-dependence phenomena, we studied the performance level 
using an equal number of data points in all ranges. Effectively, we restrict the number of data points 
available in each range to be equal to the number of samples in the highest range, VII. The quality of 
such a reconstruction is shown as the red curve in Fig. [3^. The performance level is now generally lower, 
reflecting the reduced number of time points which are included in the analysis. 

Interestingly, the "true" peak of reconstruction quality is shifted to range II, corresponding to flu- 
orescence levels just above the Gaussian in the histogram of Fig. [3]A.. This range is therefore the most 
effective in terms of reconstruction performance for a given data sampling. 

For the ranges higher than II, the reconstruction quality gradually decreases again to the 10% per- 
formance of purely random choices in ranges VI and VII. The effect of adopting a (shorter) equal sample 
size is particularly striking for range I, which drops from the best performance level almost down to 
the baseline for random reconstruction. As a matter of fact, range I is the one for which the shrinkage 
of sample length due to the constraint for uniform data sampling is most extreme (see later section on 
dependence of performance from sample size). 

The above analysis leads to a different functional network for each dynamical range studied. For 
the analysis with an equal number of data point per interval, the seven effective networks are drawn in 
Fig. [3p (for clarity only the top 10% of links are shown). Each functional network is accompanied with 
the corresponding ROC curve. 

The lowest range I corresponds to a regime in which spiking-related signals are buried in noise. 
Correspondingly, the associated functional connectivity is practically random, as indicated by a ROC 
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curve close to the diagonal. Nevertheless, information about structural topology is still conveyed in the 
activity associated to this regime and can be extracted through extensive sampling. 

At the other extreme, corresponding to the upper ranges V to VII — associated to fully developed 
synchronous bursts — the functional connectivity has also a poor overlap with the underlying structural 
network. As addressed later in the Discussion section, functional connectivity in regimes associated to 
bursting is characterized by the existence of hub nodes with an elevated degree of connection. The spatio- 
temporal organization of bursting can be described in terms of these functional connectivity hubs, since 
nodes within the neighborhood of a same functional hub experience a strongest mutual synchronization 
than arbitrary pair of nodes across the network (see Discussion and also Figure S2). 

The best agreement between functional and excitatory structural connectivity is clearly obtained for 
the inter-bursts regime associated with the middle range II, and to a lesser degree in ranges III and IV, 
corresponding to the early building-up of synchronous bursts. 

Overall, this study of state-dependent functional connectivity provides arguments to define the optimal 
dynamical regime for network reconstruction: The regime should include all data points whose average 
fiuorescence across the population gt is below a "conditioning level" ^, located just on the right side of the 
Gaussian part of the histogram of the average fiuorescence (see Materials and Methods). This selection 
excludes the regimes of highly synchronized activity (ranges III to VII) and keeps most of the data points 
for the analysis in order to achieve a good signal-to-noise ratio. Thus, the inclusion of both ranges I 
and II combines the positive effects of correct state selection and of extensive sampling. 

The state-dependency of functional connectivity is not limited to synthetic data. Very similar pat- 
terns of state-dependency are observed also in real data from neuronal cultures. In particular, in both 
simulated and real cultures, the functional connectivity associated to the development of bursts displays 
a stronger clustering level than in the inter-burst periods. An analysis of the topological properties of 
functional networks obtained from real data in different states (compared with synthetic data) is provided 
in Figure S3. In this same figure, sections of fiuorescence time-series associated to different dynamical 
states are represented in different colors, for a better visualization of the correspondence between states 
and fiuorescence values (for simplicity, only four fiuorescence ranges are distinguished). 

Analysis of two representative network reconstructions 

Our generalized TE, conditioned to the proper dynamic range, enables the reconstruction of network 
topologies even in the presence of light scattering artifacts. For non- locally clustered topologies we 
obtain a remarkably high accuracy of up to 75% of true positives at a cost of 10% of false positives. 
An example of the reconstruction for such a network, with CC = 0.5, is shown in Fig. |4]A.. For locally- 
clustered topologies, accuracy typically reaches 60% of true positives at a cost of 10% of false positives, 
and an example for A = 0.25 mm is shown in Fig. [5}^. 

In both topologies, we observe that for a low fraction of false positives detection (i.e. at high thresh- 
olds Ote) the ROC curve displays a sharp rise, indicating a very reliable detection of the causally 
most efficient excitatory connections. A decrease in the slope, and therefore a rise in the detection of 
false positives and a larger confidence interval, is observed only at higher fractions of false positives. 
The confidence intervals are broader in the case of locally-clustered topologies because of the additional 
network-to-network variability that results from the placement of neurons (which is irrelevant for the 
generation of the non- locally clustered ensembles, see Materials and Methods). 

Non-local clustering ensemble 

To address the reconstruction quality of the network topology, we focus first on the results for the non- 
local clustered ensemble. For a conditioning level which corresponds to the right hand side of the Gaussian 
in the fiuorescence amplitude histogram {g 0.2), we consider three main network observables, namely 
the distributions of local clustering coefiicients, in-degrees, and the distances of connections. As shown 
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in Fig. |4^, we obtain a reconstructed network that reproduces well the ground truth properties, with 
similar mean values and distributions for all three observables considered. We observe, however, a small 
shift towards lower clustering indices (Fig. [4^, top panel) and especially towards lower average distances 
(bottom panel) for this highly clustered network. 

Despite this underestimation bias for instances with high clustering. Fig. |4p shows the existence 
of a clear linear correlation between the real average clustering coefficient and the one of the topology 
reconstructed with generalized TE (Pearson's correlation coefficient of r = 0.92). Such linear relation 
allows, notably, a reliable discrimination between networks with different levels of clustering but very 
similar bursting dynamics. Note that this linear relation between real and reconstructed clustering 
coefficient is robust against misestimation of the expected average degree, or, equivalent ly, of the number 
of links to include, as highlighted by Figure SIB. 

TE-based reconstructions yield also estimates of the average distance of connection — constant and not 
correlated with the clustering level — with reasonable accuracy as shown in Supplementary Figure S4A. 

Local clustering ensemble 

For this ensemble (see Fig. [5|, the quality of reconstruction can be assessed even visually, by plotting 
the network graph of reconstructed connections, due to the distance-dependency of the connections. In 
Fig. we compared the structural network (top panel) with the reconstructed one (bottom panel), 
obtained by including as links only edges corresponding to the top 10% of TE values. This corresponds 
here to about 600 true positives (~ 50% of all possible true positives, and plotted in green) and about 
400 false positives (~ 5% of all possible false positives, plotted in red). The statistical properties of the 
structural and reconstructed networks are shown in Fig. |5p. Again, reconstructed network properties 
correlate with real properties. The reconstructed distribution of connection distances displays a reduced 
right-tail compared to the real one. A tendency to estimate a more local connectivity is evident also from 
a marked overestimation of local clustering coefficients. We attribute such a mismatch to light scattering 
artifacts that increase local correlations in a spatial region matching the length scale of real structural 
connections. This is conffimed by the fact that the length scale is correctly inferred in simulations without 
the light scattering artifact (not shown). 

Note, that there is again a very good linear correlation (Pearson's correlation of r = 0.97) between 
the actual and reconstructed (spatial) average connection length, as shown in Fig. [Sp. Similarly, the 
reconstructed average clustering coefficient is linearly correlated with the ground truth one (r = 0.98), 
as shown in Supplementary Figure S4B. 

Sensitivity to reconstruction approaches 

Overall, TE of Markov order k = 2 (i.e. taking into account multiple time scales of interaction, see eq. 
9 in Methods section) achieved a performance level ranging between 40% and 80% at a level of 10% of 
false positives, for any clustering type and level. 

In Fig. [4p and Fig. [5p we compare the performance of generalized TE with other reconstruction 
strategies, respectively for the non-local and for the local clustering ensemble. We observe then that 
Ml-based reconstructions yield linear correlations between real and reconstructed clustering coefficient 
and length scales as well. For the adopted optimal conditioning level, MI can actually out-perform 
generalized TE (cfr. Supplementary Figure S4), due probably to the smaller sample size required for 
its estimation. On the contrary, XC-based reconstructions fail in all cases to reproduce these linear 
correlations, yielding a constant value independently from the ground-truth values. For the non-local 
clustering ensemble, it distinctly over-estimates average clustering level; for the local clustering ensemble, 
it severely underestimates the average length of connections. Therefore, in XC-based reconstructions, 
all information on the actual degree of clustering in the network is lost and a high clustering level is 
invar iantly inferred. 
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GC-based reconstructions display the same error syndrome (not shown), which indicates that captur- 
ing non-linear correlations in neural activity — as MI and TE can do, but XC and GC cannot — is crucial 
for the inference of the clustering level. 

We would like to remind that XC-, GC- and Ml-based methods, analogously to the generalized TE 
approach, include, as generalized TE, the possibility of "same-bin interactions" (zero-lag). Furthermore, 
we have modified them to include as well optimal conditioning in order to make the comparison between 
different methods fair. The forthcoming section gives more details about comparison of performance for 
different conditions and methods. 

Performance comparison: Role of topology, dynamics and light scattering 

The performance level (fraction of true positives for 10% of false positives, denoted by TPio%) provides a 
measure of the quality of the reconstruction, and allows the comparison of different methods for different 
network topologies, conditioning levels, and external artifacts (i.e. presence or absence of simulated 
light scattering). We test linear methods, XC and GC (of order 2; the performance of GC of order 1 is 
very similar and not shown), and non-linear methods, namely MI and TE (of Markov orders 1 and 2; 
see Materials and Methods for details). XC and MI are correlation measures, while GC and TE are 
causality measures. Note that, for each of these methods, we account for state dependency of functional 
connectivity, performing state separation as described in the Materials and Methods section. 

In the case of the non-local clustering ensemble and without light scattering (Fig. |6j top row), even 
a linear method such as XC achieves a good reconstruction. This success indicates an overlap between 
communities of higher synchrony in the calcium fiuorescence, associated to stronger activity correlations, 
and the underlying structural connectivity, especially for higher full clustering indices. 

GC-based reconstructions have an overall worse quality, due to the inadequacy of a linear model for 
the prediction of our highly nonlinear network dynamics, but they show similarly improved performance 
for higher CC. 

In a band centered around a shared optimal conditioning level g 0.2, both MI and generalized TE 
show a robust performance across all clustering indices. This value is similar to the upper bound of 
the range II depicted in Fig. [3]A., i.e. it lies at the interface between the bursting and silent dynamical 
regimes. In particular for TE and in the case of low clustering indices (which leads to networks closer 
to random graphs), conditioning greatly improves reconstruction quality. At higher clustering indices 
the decay in performance is only moderate for conditioning levels above the optimal value, indicating an 
overlap between the functional connectivities in the bursting and silent regimes. Note, on the contrary, 
that the performance of MI rapidly decreases if a non-optimal conditioning level is assumed. 

The introduction of light scattering causes a dramatic drop in performance of the two linear methods 
(XC and GC), and even of MI and TE with Markov order k = 1. The performance of TE at Markov 
order 2 also deteriorates, but is still significantly above the random reconstruction baseline in a broad 
region of parameters. Interestingly, for the optimal conditioning level ^ :^ 0.2 the performance of the 
TE for /c = 2 does not fall below TPio% ~ 40% for any clustering level or A value. It is precisely in 
this optimal conditioning range that we obtain the linear relations between reconstructed and structural 
clustering coefficients, for both the non-local and the local clustering ensembles. 

A similar trend is obtained when varying the length scale A in the local ensembles (see Supplementary 
Figure S5). For very local clustering and without light scattering, both XC and TE achieve performance 
levels up to 80%. The introduction of light scattering, however, reduces the performance of all measures 
except for MI (but only in the narrow optimal conditioning range) and for TE of higher Markov orders 
(robust against non-optimal selection of conditioning level). Overall, the performance of the reconstruc- 
tion for the local clustering ensembles is lower than for the non- locally clustered ensembles. This is true, 
incidentally, also in absence of light scattering, since networks sampled from this ensemble, as soon as the 
length scale is sufficiently long, tend to be very similar to purely random topologies (of the Erdos-Renyi 
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type, see e.g. [^), for which performance is generahy poorer (cfr. top row of Figure|6j for weak clustering 
levels). 

Contributions to the performance of generalized TE 

Our new TE method significantly improves the reconstruction performance compared to the original TE 
formulation [Eq. ([9|]. As shown in Fig. [tJA. for both the local and the non-local clustered networks, 
reconstruction with the original TE formulation (Fig. [tJA., blue line) yields worse results than a random 
reconstruction, as indicated by the corresponding ROC curves falling below the diagonal. Such a poor 
success is due in large part to "misinterpreted" delayed interactions. Indeed, by taking into account same 
bin interactions, a boost in performance is observed (red line). Fig.[7]A. also shows that an additional leap 
in performance is obtained when the analysis is conditioned (i.e. restricted) to a particular dynamical 
state of the network, increasing reconstruction quality by 20% (yellow line in Fig.[7|^). The determination 
of the optimal conditioning level is discussed later and takes into account the considerations introduced 
above (cfr. Fig. |3|. 

Note that the introduction of same bin interactions alone (red color curves) or conditioning on the 
dynamical state of network alone (yellow color curves) already brings the performance to a level well 
superior to random performance. However, at least for our simulated calcium-fiuorescence time series, 
a remarkable boost in performance is obtained only when same-bin interactions inclusion and optimal 
conditioning are combined together (green color curves). Although, in principle, conditioning is enough to 
select indirectly a proper dynamical regime, the poor time-resolution of the analyzed signals (constrained 
not only by the frame-rate of acquisition but also intrinsically by the kinetics of the dissociation reaction 
of the calcium- sensitive dye |66j) requires as well the potential consideration of causally-linked events 
occurring in a same time-bin. 

A different way to represent reconstruction performance are "Positive Precision Curves" (as introduced 
in [48] and described in the Materials and Methods section) by plotting, at a given number of reconstructed 
links, the "true- false ratio" (TPR), emphasizing the probability that a reconstructed link is present in 
the ground truth topology (true positive). For the same networks and reconstruction as above, we plot 
the PPCs in Fig. S6 (for the ROC curves see Fig.[7]A.). Over a wide range of the number of reconstructed 
links (TFS), the PPC displays positive values of the TFR, indicative of a majority of true positives over 
false positives. For both the locally and non-locally clustered networks, the PPC reaches a maximum 
value of the TPR about 0.5 and remains positive up to about 18% of included links for the clustered 
topology, or up to 12% in the case of the local topology. 

Recording length affects performance 

In Fig. [7^, we analyze the performance of our algorithm against changes of the sample size. Starting from 
simulated recordings lasting Ih of real time (corresponding to about 360 bursting events) and with a full 
sample number of of iSih, we trimmed these recordings producing shorter fluorescence time series with 

— S\\^l s samples, with s being a divisor of the sample size. For both network topology ensembles, 
we found that a reduction in the number of samples by a factor of two (corresponding to 30 minutes or 
about 180 burst) still yields a performance level of ~ 70%. By further reducing the sample size, we reach 
a plateau with a quality of ~ 30% for about 40 bursts (corresponding to 6 minutes). 

All the experiments analyzed in this work are carried out with a duration between 30 and 60 minutes. 
Since conditioning, needed to achieve high performance, requires one to ignore a conspicuous fraction of 
the recorded data, we expect long recordings to be necessary for a good reconstruction, albeit the fact 
that it is possible to increase the signal-to- noise ratio by increasing the intensity of the fluorescent light. 
However, the latter manipulation has negative implications for the health of neurons due to photo-damage, 
limiting our experimental recordings to a maximum of 2 hours. 
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Analysis of biological recordings 

We apply our analysis to actual recordings from in vitro networks derived from cortical neurons (see 
Methods). To simplify the network reconstruction problem, experiments are carried out with blocked 
inhibitory GABA-ergic transmission, so that the network activity is driven solely by excitatory connec- 
tions. This is consistent with previously discussed simulations, in which only excitatory neurons were 
included. 

We consider in Fig. [S] a network reconstruction based on a 60 minutes recording of the activity of 
a mature culture, at day in vitro (DIV) 12, in which N = 1720 active neurons were simultaneously 
imaged. A fully analogous network reconstruction for a second, younger dataset at DIV 9 is presented in 
Supplementary Figure S7. In general, fluorescence data was not affected neither by photo-bleaching nor 
by photo-damage during this time, as proved by the stability of the average fluorescence signal shown in 
the Supplementary Figure S8A. 

The probability distribution of the average fluorescence signal is computed in the same way as for the 
simulated data. Neuronal dynamics and the calcium fluorescence display the same bursting dynamics 
well captured by the simulations, leading to a similar fluorescence distribution (Fig. [ip). Thanks to 
this similarity we can make use of the intuition developed for synthetic data to estimate an adequate 
conditioning level. We select therefore a conditioning level such as to exclude the right-tail of high 
fluorescence associated to to fully-developed bursting transient regimes. We have verified, however, that 
the main qualitative topological features of the reconstructed network are left unchanged when varying 
the conditioning level in a range centered on our "optimal" selection. More details on conditioning level 
selection are given in the Materials and Methods section. 

Reconstruction analysis is carried out for the entire population of imaged neurons. We analyze a 
network defined by the top 5% of TE-ranked links, as discussed in a previous section. Such choice leads 
to an average in-degree of about 100, compatible with average degrees reported previously for neuronal 
cultures of corresponding age (DIV) and density p^[67j . 

Comparison with randomized networks 

The ground truth excitatory connectivity is obviously not known for real recordings and performance 
cannot be assessed by means of an ROC analysis. However, we can compare the obtained reconstruction 
to randomized variations to identify non-trivial topological features of the reconstructed network. We 
perform two kinds of randomizations: In a first one, randomization is full and only the number of network 
edges is preserved. Comparison with such fully randomized networks detects significant deviations of the 
reconstructed network from an ensemble of random graphs in which the degree follows the same prescribed 
Poisson distribution for each node (Erdos-Renyi ensemble, see e.g. 65 ). In a second randomization, both 
the total edge number and the precise out-degrees of each node are preserved. The comparison of the 
reconstructed network with such a partially randomized ensemble detects local patterns of correlations 
between in- and out-degrees — including, notably, clustering — which do not arise just in virtue of a 
specific distribution of out-degrees. Comparison with partial randomizations is particularly important 
when skewed distributions of degrees are expected. 

Topology retrieved by TE 

Our analysis shows that the resulting reconstructed topology is characterized by markedly non-local 
structures, as visible in the portion of the reconstructed network in Fig. Distributions of degree, 
distance of connection and local clustering coefficients inferred by TE are shown in the top row of Fig. [8^ 
(yellow histograms). The degree distribution is characteristically broadened and distinctly right-skewed, 
deviating from the Poisson distribution associated to Erdos-Renyi random graphs (the histogram for fully 
randomized networks shown in blue). Note that, for partial randomizations (histograms shown in red). 
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we have randomized the out-degree of each node but plotted here the resulting in-degree distribution (the 
distribution of out-degrees would be, by construction, unchanged). 

While the distribution of connection distances matches the one of randomized networks, TE detects 
clustering at a level which is moderate (CC c:^ 0.09) but significantly larger than for random networks. 
Note that this larger clustering cannot be ascribed to the broadened degree distribution since both full 
and partial (red histograms) randomizations lead to consistently smaller clustering levels (CC 0.05). 

Topology retrieved by XC 

By analyzing a network reconstruction based on cross-correlation (XC), we find differences to TE (bottom 
row of Fig. [8^). In particular, XC infers a distribution of distances markedly more local than for full and 
partially randomized network instances and, correspondingly, a markedly higher clustering (CC 0.17). 
Distribution of degrees inferred by XC is on the contrary random-like. 

As a matter of fact, remarkably similar patterns of discrepancy between reconstruction results based 
on TE and on XC are also robustly present in synthetic data. Synthetic data analyses consistently show 
a superior performance of TE compared to XC. Furthermore, these analyses identify a tendency of XC 
to infer an artificially too local and too clustered connectivity. Therefore, we believe that the topology 
of the neuronal culture inferred by XC is not reliable, and is biased by the aforementioned systematic 
drifts. 

Discussion 

Relation to state of the art 

We have introduced a novel extension of Transfer Entropy, an information theoretical measure, and 
applied it to infer excitatory connectivity of neuronal cultures in vitro. Other studies have previously 
applied TE (or a generalization of TE) to the reconstruction of the topology of cultured networks [39^,48] . 
However, our study introduces and discusses important novel aspects, relevant for applications. 

Model independency 

Our algorithm is model-independent and can be used virtually without modifications even for the recon- 
struction based on spike trains or voltage traces. This is important, since massive datasets with modalities 
beyond calcium fiuorescence imaging might become available in a near future, thanks to progresses in 
connectomics research. This model-independence, as previously mentioned, is also important to avoid 
potential artifacts due to a wrong choice of model for neuronal firing or for network topology. Therefore, 
it constitutes a major advantage with respect to regression methods or even more elaborated Bayesian 
approaches, as the one considered in |45 . Both regression and Bayesian techniques indeed assume specific 
models of calcium fiuorescence and neuronal firing dynamics, either explicitly (in the case of the Bayesian 
framework) or implicitly (assuming a linear dynamical model in the case, e.g., of XC or GC). Note that 
we use here dynamical network models to benchmark our reconstruction quality. However, our method 
still remains model-free, because knowledge about these models is not required for reconstruction. 

No need for spike times 

Competitor approaches used |48 or put emphasis on the need of reconstructing exact spike times |45 with 
sophisticated deconvolution techniques 43 , as a preprocessing step before actual topology reconstruction. 

As we have shown here, acquiring such difficult-to-access information is unnecessary for our method, 
which performs efficiently even for slow calcium ffuorescence acquisition rate and operates directly on 
imaging time series. This is a crucial feature for applications to noisy data, which remains useful even 
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when — as for the data analyzed in this study — the signal-to-noise ratio is sufficiently good to allow 
sometimes the isolation of individual ffiing events (cfr. Fig. [l^). 

Robustness against bursting 

We optimized our algorithm to infer excitatory connectivity based on time series of calcium fluores- 
cence with a complex nonlinear dynamics, capturing the irregular bursting and the corresponding time- 
dependent degree of synchronization observed in cultured networks in vitro. To our knowledge, no pre- 
vious study about algorithmic connectivity reconstruction has tackled with simulated dynamics reaching 
this level of realism. We have here identified a simple and conceptually elegant mean-field solution to the 
problem of switching between bursting and non-bursting states, based just on conditioning with respect 
to the average level of fluorescence from the whole culture. 

A feature of our model network dynamics, and one that is crucial to reproduce temporally irregular 
network bursting, is the inclusion of short-term depressing synapses. Remarkably, other studies 39 , 
which have modeled explicitly more complex forms of spike-time dependent synaptic plasticity, neglect 
completely this short-term plasticity, failing correspondingly to generate a realistic model of spontaneous 
activity of an in vitro culture. Yet, our network model remains very simplified, although the use of 



networks of integrate-and-fire neurons to generate surrogate data is widespread 39 ,45 48 . Several 
features of real cultured neurons are not explicitly included, like heterogeneity in synaptic conductances 
and time-constants, slow NMDA excitatory currents or distance-dependent axonal delays. However, time- 
series from more complex models could be used to benchmark our algorithm, without need of introducing 
any change into it, due to its model- free nature. 

Robustness against light scattering 

We have found that, among the tested methods, only generalized TE of at least Markov order k = 2 
with a proper conditioning allows to distinguish random from clustered topologies and local from long- 
range connectivities in a reliable manner, in the presence of light scattering artifacts. These artifacts 
indeed lead to the inference of spurious interactions between the calcium signal of two nodes, reducing 
the performance of linear causality measures like XC or GC to a random level. Note that this is very 
likely a similar effect as in 36 , where reconstruction with TE is still possible despite cross-talk between 
EEG electrodes. 

Low computational complexity 

Finally, we note that our algorithm is computationally simple and relatively efficient. Preprocessing 
of time series is a simple discrete differentiation. State selection is achieved via conditioning data on a 
range, which requires only to read once the input time series and compare them with a threshold. The 
inference procedure itself is not iterative (unlike in [42 "45') and evaluation of TE is done through simple 
"plug-in" estimation, which is fast (unlike elementary steps in, e.g., 45 ). The principal determinant of 
computational complexity is therefore the growing number of putative links (which grow as 0{N'^)) and, 
hence, of TE scores that must be computed. 

For the recording duration of one hour considered in this work, we typically found a computation 
time of approximately Tcomp ^ 60ms * N'^ including pre-processing on a 2.67 GHz Intel Xeon processor. 
Reconstructions of networks with 100 nodes required roughly ten minutes; with 1000 nodes roughly half 
a day. Note however that as the computation of TE for two distinct links is (after pre-processing and 
conditioning) computationally independent, it can be easily parallelized, reducing the computation time 
by a factor equal to the number of CPUs. 
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Good reconstruction of topological features 
Good reconstruction of clustering 

Based on synthetic time series of calcium fluorescence, we have studied the relation between ground truth 
topological features and their reconstructed counterparts. By restricting directed functional connectiv- 
ity estimation to a proper dynamical regime through conditioning, we found strong linear correlations 
between real and reconstructed topological properties, both for the average Euclidean distance of con- 
nections and the clustering coefficient (CC). Note that deviations from this linear relationship at higher 
CC values, visible in Fig. [4p, are in part due to the fact that we have used a constant conditioning level 
for all reconstructions, while the optimal conditioning level increases slightly for high CCs. 

Clustering coefficient is the most widespread measure of higher order topological features, going 
beyond characterization of neighbors in a network, but analyzing correlation between local neighborhoods 
of different nodes. We have adopted the so-called "full" clustering index to measure clustering in our 
directed networks. However, several other definitions of clustering coefficient for directed network exist, 
emphasizing the contribution to the clustering phenomenon of different topological motifs such as cycles 
or "middleman" loops [68 . We have checked that the linear relationship between real and reconstructed 
clustering indices hold for all the clustering index types defined in [68], with an almost identical degree 
of correlation. This hints at a good capacity of our algorithm to reconstruct different classes of graph 
topology motifs. 



Good reconstruction of connectivity motifs 

We have focused on the performance of our algorithm in reconstructing specific topological motifs in- 
volving more than two nodes. Considering for instance the network whose reconstruction is illustrated in 
Fig. |5j we identified in the ground truth topology of this network all occurrences of shared source motifs, 
i.e. motifs in which a node A was connected to a node B {A ^ B) and to a node C {A ^ C) but in 
which there was no direct connection between B and C in the reconstructed as well as the ground truth 
topology. Such a motif might be harmful for reconstruction, because the existence of a shared input A 
might be mistaken for a reciprocal causal interaction between nodes B and C. Nevertheless, only in 
a minority of cases (about 20%), spurious links B ^ C or C ^ B where erroneously included in our 
reconstruction. Equivalently, in the case of embedded chains, i.e. motifs in which a node A is connected 
to B, connected on its turn to C, without a direct link from A to C {A ^ B ^ C), the presence of a 
link A ^ C (reflecting potentially indirect causation, mistaken for direct) was erroneously inferred only 
in, once again, about 20% of cases. Note that other studies have investigated the performance of various 
metrics in the reconstruction of specific small network motifs 69,70 , but it is not clear that the efficiency 
of reconstruction quantified for such small networks continues to hold when these motifs are embedded 
in larger networks with hundreds of nodes or more. 

Good reconstruction of higher-order topological features is important since these features (like e.g. 
the tendency of existing links to form chains) are known to affect the synchronizability of networks. 



as stressed by a recent study 71 . Our algorithm was not specifically optimized to infer higher order 
structures. Analyses based on /c-ples rather than on pairs of nodes might lead to a better description 
of higher order connectivity structures, at a price, however, of a more severe sampling problem. In this 
sense, conditioning transition probabilities for a specific edge to the ongoing mean- field activity, constitute 
already a first compromise, allowing to extend the analysis beyond a mere pairwise approach. 



Overestimation of bidirectional links 



In 71 , the role played by the fraction of bidirectional links (i.e. situations in which there is a connection 
A B, but also a connection B ^ A) was also explored. 
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In our case, approximately 60% of the bidirectional links present in the ground truth topology were 
reconstructed as bidirectional in the reconstruction of Fig. |4] However, this reconstruction exhibited 
as well a severe overestimation of the number of bidirectional links. Unidirectional in the ground-truth 
topology were more difficult to detect, such as only 4% of them were ranked among the top 10% of TE 
scores. Furthermore, when such unidirectional links A ^ B were reconstructed, the presence of a link 
B ^ A was also in most of the cases (85%) spuriously inferred, i.e. included unidirectional links were 
often mistaken for bidirectional links. 

It is likely that an improvement in the inference of directionality might be achieved by increasing the 
time-resolution of the recordings beyond the present limits of calcium imaging techniques. 



Good reconstruction of connectivity length scale 

Finally, we have also analyzed the distance dependence of the reconstructed probability of connection 
in the system. We have found a good agreement between real and reconstructed average length scale of 
connections, albeit we did find more local connections than present in the ground truth topology (see 
for example Fig. [4^, bottom panel). This is an artifact due to light scattering, as it is suggested by the 
underestimated peak of the reconstructed connection distance histogram, which matches the characteristic 
length scale of simulated light scattering Age- 



Functional connectivity hubs 

As shown in Fig. |3] for simulated data and in Supplementary Fig. S3 for actual culture data, epochs of 
synchronous bursting are associated to functional connectivity with a stronger degree of clustering and a 
weaker overlap with the underlying structural topology. This feature of functional connectivity is tightly 
related to the spatio-temporal organization of the synchronous bursts. 

In figure S2A, we highlight the position of selected nodes (of the simulated network considered in 
Fig.|3|, characterized by an above-average in-degree of functional connectivity (see Materials and Methods 
for a detailed definition in terms of TE scores) . We denote these nodes — different in general for each of 
the dynamic regimes numbered from I to VII — as (state-dependent) functional connectivity hubs. 

Given a specific hub, we can then define the community of its first neighbors in the corresponding 
functional network. Consistently across all dynamic ranges (but the noise-dominated range I) we find that 
synchronization within each of these functional communities is significantly stronger than between the 
communities centered on different hubs {{p < 0.01, Mann- Whitney test, see Materials and Methods). The 
results of this comparison are reported in Figure S2B, showing particularly marked excess synchronization 
during burst build-up (ranges II, III and IV) or just prior to burst waning in the largest-size bursts (VII). 
Therefore, functional connectivity hubs reflect foci of enhanced local bursting synchrony. 

Other studies (in brain slices) reported evidence of functional connectivity hubs, whose direct stimu- 



lation elicited a strong synchronous activation 72 , 73 . In 72 , the functional hubs were also structural 



hubs. In the case of our networks, however only functional hubs associated to ranges II and III have an 



in-degree (and out-degree) larger than average as in 72 . In the other dynamic ranges, this tight corre- 



spondence between structural and functional hubs does not hold anymore. Nevertheless, in all dynamic 
ranges (but range I), we find that pairs of functional hubs have an approximately three-times larger 
chance of being structurally connected than pairs of arbitrarily selected nodes (not shown) . 

The timing of firing of these strong-synchrony communities is analyzed in Figure S2C. There we 
show that the average time of bursting of functional communities for different dynamic ranges is shifted 
relatively to the average bursting time over the entire network (details of the estimation are provided 
in Materials and Methods). This temporal shift is negative for the ranges II and III (indicating that 
functional hubs and related communities fire on average earlier than the rest of the culture) and positive 
for the ranges V to VII (indicating that firing of these communities occurs on average later than the 
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rest of the culture). The highest negative time delay is detected in range III, such that the communities 
organized around its associated functional hubs can be described as local burst initiation cores 



Purely excitatory networks 

In this study we did not consider inhibitory interactions, neither in simulations nor in experiments 
(GABAergic transmission was blocked), but we attempted uniquely the reconstruction of excitatory 
connectivity. 

We would like to point out that this is not a general limitation of TE, since the applicability of 
TE does not rely on assumptions as to the specific nature of a given causal relationship - for instance 
about whether a synapse is excitatory or inhibitory. In this sense, TE can be seen as a measure for the 
absolute strength of a causal interaction, and is able in principle to capture the effects on dynamics of 



both inhibitory and excitatory connections. Note indeed that previous studies 39,48 have used TE to 
infer as well the presence of inhibitory connections. However, TE alone could not discriminate the sign of 
the interaction and additional post-hoc considerations had to be made in order to separate the retrieved 



connections into separate excitatory and inhibitory subgroups (e.g., in 39 , based on the supposedly 



known existence of a difference in relative strength between excitatory and inhibitory conductances). 
A simpler complex problem 

By focusing on excitatory connectivity only, our intention was to simplify the full problem of network 
reconstruction, aiming as a first step to uncover systematically the strongest excitatory links in the 
network. Such simplification allows indeed to remove potential causes of error, like, e.g. disynaptic 
inhibition or synchronous excitatory and inhibitory inputs to a same cell, that might result in failed 
inference of both the excitatory and the inhibitory connections. Nonetheless, this simpler problem remains 
very difficult because we attempt to reconstruct not only few connections but an entire adjacency matrix 
for the excitatory subnetwork. 

Excitation shapes spontaneous activity 

The inference of excitatory connectivity is by itself a very relevant issue. Excitatory recurrent connections 
are indeed a strong — if not the main — determinant of spontaneous activity. They act as an "amplifier" , 
propagating to the network locally generated firing, and guaranteeing thus that the level of spontaneous 
activity of the culture remains elevated. Furthermore, modeling studies (see e.g. [9j|76)) suggest that 
excitatory connections only are sufficient to obtain network bursts with experimentally observed statistics 
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Sharper signals 

Second, when moving to experimental data, neurons fire more strongly when inhibition is blocked, which 
makes the identification of firing occurrences more accurate. The amplitude of the fiuorescence signal 
increases by a factor two or more when inhibition is blocked (see e.g. 67 ). In Fig. S8B we show an example 
of the fiuorescence signal for the same neuron, before (blue) and after (black) blocking inhibitory synaptic 
transmission. Therefore the reconstruction problem of purely excitatory networks becomes simpler, not 
really from the algorithmic side (since the algorithm is potentially ready to cope with inhibition), but 
rather from the experimental side, because of an improved signal-to- noise ratio. Thus, when we generate 
synthetic data for algorithmic benchmarking, we aim at reproducing these same optimal experimental 
conditions. 

We also note that the distribution of fiuorescence values would be different for recordings with and 
without inhibition. Nevertheless, our method might still be applied, without need of qualitative changes, 
and state selection might still be performed. The optimal range for conditioning would be different and 
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should again be estimated through a model-based benchmarking, but this would lead only to quantitative 
adjustments. 



A two-steps strategy? 

A possible strategy to extend our method to the reconstruction of inhibitory connections could be to follow 
a two-steps experimental approach: first, reconstruct only excitatory connectivity, based on recordings 
in which inhibitory transmission is blocked; and second, reconstruct only inhibitory connectivity, based 
on recordings in which, after the wash-out of the culture, excitatory transmission is blocked. In such 
an experiment, when recurrent excitation is blocked, the spontaneous level of firing activity should be 
restored by chemical non-synaptic activation. We note however, that although complete blockade of 
excitation combined with drug-induced network activation is a standard protocol in slice studies (cfr. 
e.g. [77]), such an approach has never been attempted in cultures of dissociated neurons. 

A compromise might be, therefore, to systematically compare the reconstructed connectivity before 
and after the wash-out of the inhibition blocker thereby collecting indirect evidence about the existence 
of inhibitory connection, through an analysis of the modulatory action they exert on the causal strengths 
of previously inferred excitatory connections. Note that, at least according to [78 , GABAergic synapses 
would be able to activate voltage dependent Ca^^ channels and elevate the intracellular calcium con- 
centration only before GAB A switch (occurring around DIV 7, see later discussion about culture age). 
Hence, only before such early developmental event one could observe a modification of the fluorescence 
signal due to GABA action, making a two-steps approach necessary (first excitation, through direct 
analysis, and then, inhibition, indirectly through connectivity comparison). 

Note also that the overall strength of inhibition in the network could be revealed by studying the decay 
in bursting activity as a result of gradually weaker excitation. Such an approach is possible by targeting 
the AMPA-glutamate receptors in excitatory neurons with increasing concentrations of antagonists such 
as CNQX (cfr. ^14^|67j ), and in the presence or absence of inhibitory synapses. 



Experimental paradigm 

Connectivity in cultures vs. connectivity in vivo 

We have here used our current TE algorithm to the inference of connectivity from calcium fluorescence 
recordings of in vitro cultured networks of dissociated cortical neurons. However, such studies of cultures 
do not yield direct information about the connectivity of cortical tissues in vivo. Several factors contribute 
to shape the neuronal circuits in an intact living brain, including the adequate orientation of neurons, 
dendrites, and axons, the biochemical guidance of processes towards their targets, and the refinement of 
circuitry through activity. Ultimately, development leads to a complex cortical structure organized both 
in layers and in columns, and with many particular topological features such as clusters or hierarchical 
organization 5 . 

All these structures — which are preserved in slices [7[|62j[79j[80] — are completely lost during the 
dissociation of the embryonic brain that precedes the preparation of in vitro cultures. Neuronal cultures 
form new connections "from scratch" , with a combination of short and long length connections, leading 
to circuits that have orders of magnitude less neurons and synapses. 

Nevertheless, understanding how neurons wire together spontaneously in a controlled medium is also 
of utmost importance, because it allows separating endogenous and exogenous driving components of 
neuronal wiring. Furthermore, cultures and slices share similar spontaneous bursting dynamics. If this 
observation alone should not be used to support the existence of shared topological features (cfr. Fig. |2|, 
it is true that the self-organization principles underlying the development of networks up to a bursting 
dynamical state may be common in all living neuronal networks. 
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Calcium imaging vs. multielectrode arrays 

Our algorithm has been optimized for the apphcation to real calcium imaging data, by determining an 
optimal conditioning range based only on qualitative features of the distribution of the average fluo- 
rescence in the network (very similar for synthetic and real data). Other studies have however used 
electrophysiological recordings from cultures grown on multielectrode arrays (MEAs) p!2{p!5lp!7[| 1 8 [[56] as 



a basis for their topology reconstruction strategies (see e.g. 39 48]). 

MEAs provide excellent temporal resolution and, to a certain extent, also the possibility to resolve 
individual spikes. However, MEAs have a limited number of electrodes and often neurons are not precisely 
positioned on an electrode but at its vicinity, which requires complex processing of the data to identify 
the source of a given spike. Additionally, the electrodes have to be in contact or embedded into the 
neuronal tissue, limiting its use to mostly cultures and brain slices. 

Calcium imaging, in contrast, offers important advantages. First, the technique provides access to the 
activity of thousands of neurons in large fields of view for several hours [13 ,14 , and with a time resolution 
proven to be sufficient for reconstruction in the present study. Second, calcium imaging with superior 
temporal and spatial resolution f22''58^ is a technique that combines perfectly with new breakthroughs in 
experimental neuroscience, particularly optogenetics [23^,^ and genetically encoded calcium indicators 



81 82 , technologies that are under expansion both in vitro 83 and in vivo 84 . Our reconstruction 
method can promptly be readapted for the analysis of other calcium imaging datasets. The analysis of 
in vitro cultures of dissociated neurons is just a first proof-of-concept of the applicability of generalized 
TE to real data. 

As a matter of fact, extended TE might even be applied to multielectrode array data with very few 
modifications, along lines analogous to 39 . The advantages of an increased time-resolution might then 
be combined with the efficacy of the state-selection through conditioning concept. 

Age of neuronal cultures 

In our study we recorded from early mature (DIV 9-12) instead of fully mature cultures. Young but 
sufficiently mature cultures have rich bursting events while preserving some isolated activity . On the 
contrary, fully mature cultures show strong synchronized bursting dynamics (epileptic-like), with very 
little isolated neuronal activity 85 . In this sense, therefore, young cultures emerge as a model system 
which better matches the features of non pathologic cortical tissue activity. At the same time, conditions 
are ideal for an analysis focusing on inter-burst periods, as our one (different, in this sense, from an 
alternative approach focusing on burst initiation as [42].) 

Several authors have studied the process of maturation of neuronal cultures, and characterized their 
spontaneous activity along days or weeks lT|[86] . Some studies have identified a stage of full maturation 
and stable bursting dynamics at the third week, a stage that can last for months (e.g. fSSl). However, 
depending on neuronal density p^[T8][2Q] , glial density 87 , and the gestation time of the embryo at the 
moment of dissectio n |14| , spontaneous activity with rich episodes of population bursts can emerge as 



early as DIV 3-4 |[T4}|20]): 

On the other hand, GABA switch (the developmental event after which the action of GABAergic 
synapses become inhibitory as in fully developed networks and stops being excitatory 88 ) in cultures 
similar to ours occurs at around day 7 [14^^ • Therefore, it has already taken place in early mature 
cultures at DIV 9 (such as the one analyzed in Fig. S7). This is confirmed by the fact that the blockade 
of inhibition by bicuculline leads to an increase of the amplitude of the fluorescent signal by a factor 2-3 
(see Figure S8B), as expected after GABA switch has occurred. 
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Non-local and moderately clustered connectivity in cultures 
Evidence for long-range connections 

Our TE-based algorithm applied to the reconstruction of the topology of neuronal cultures in vitro have 
inferred the existence of direct connections between neurons separated by a considerable distance. Indeed, 
the reconstructed distribution of connection distances peaks at a remarkably high value and is markedly 
high-skewed (Fig. |8^) 

Experimental studies showed that cortical slices have a maximum probability of connection at much 
shorter distances 61 . We note, however, that the density of cells in our culture is considerably more 
diluted than in normal cortical tissues. Furthermore, cortical developmental processes strongly dictate 
in vivo (and slice) connectivity 89 , while a larger freedom to establish connections exist potentially in 
cultures in vitro. The growth process and the final length of the processes depend not only on the density 
of the culture and the population of glia, but also on the chemical properties of the substrate where 
neurons and process grow 90 . The lack of restrictions during the development of neuronal processes 
(axons and dendrites) leads to longer axonal lengths in the culture on average, explaining the high 
connection distance obtained in the reconstruction. 

Such long axons in cultures have been reported in other studies [T5j[54][9l] , providing broad distribu- 
tions of lengths with an average value of 1.1 mm. Using Green Fluorescence Protein transfection 92] , 
we have directly visualized axonal processes as long as 1.8 mm in cultures of comparable age and density 
to the ones that we used for the fluorescence imaging recordings (Figure S8C). Such long axons would 
be required to mediate the long-range causal interactions captured by the TE analysis. As a matter of 
fact, as revealed by the randomization of the reconstructed connectivity, long average connection lengths 
might simply be expected to exist as a result of the bi-dimensional distribution of neurons on a substrate 
in combination with long individual axonal lengths (i.e. spanning the entire field of view). 

For a younger culture at 9 DIV (Fig. S7), on the other hand, we retrieved a probability of connection 
at short distances higher than expected from randomized networks. This over-connectivity at short 
distances can be ascribed to the fact that young cultures have less developed axons and therefore a 
connectivity favored towards closer neurons. However, more exhaustive studies — going beyond the scope 
of the present work — would be needed to assess the full dependence of reconstructed network topology 
on culture age, neuronal density and spatial distribution, as well as on connection length. 

Evidence for moderate clustering 

Our TE-based analysis suggest an enhanced tendency to clustering of connections in the analyzed neuronal 
cultures. Indeed, although the average clustering coefficient is moderated, it is significantly higher than 
what might be expected based on randomized networks. 

Neurons in culture aggregate during growth, giving rise to fluctuations in neuronal density. Since 
denser areas in the culture have been reported to have a higher connectivity 14 , some level of clustering 
is naturally expected in the real network, as detected by the reconstruction. We could not however 
identify in our reconstructed topology a statistically significant correlation between the local density of 
neurons — quantified by the number of cells within a radius of 50/im centered on each given neuron — 
and the local degree of clustering (r < 0.01 for both considered cultures). 

This finding suggests the existence of a mixture of local and non-local clustering in the culture and 
indicates that network clustering in a sparse culture is not a mere byproduct of the inhomogeneous density 
of cells, but might reflect activity-dependent mechanisms for synaptic wiring. 

A bias of cross-correlation methods? 

XC-based reconstructions inferred a much more local average distance of connection, significantly smaller 
not only than the inference based on TE, but also than what expected from randomized networks. At 
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the same time, XC-based methods inferred an average clustering coefficient almost twice as large as TE 
approaches. 

As a matter of fact, the analyses of Fig. [4] and |5] suggest that the connectivity of real cultures inferred 
by XC is prone to include artifactual features, such as an exceedingly local connectivity, paired to an 
overestimated level of clustering with respect to ground-truth (unknown in cultures). 

Interestingly (and possibly indicative of biased estimations) , the discrepancies between TE-based and 
XC-based inferences of mean connection distance and clustering level are paralleling both the systematic 
deviations between XC-bases and TE-based reconstructions identified in applications to surrogate data. 



No evidence for scale free connectivity 

Despite the observation that the distribution of degrees inferred by TE is characteristically broadened and 
has a conspicuous right tail, we have found no evidence of a scale free degree distribution (i.e. following 
a power-law 93 ). A similar conclusion was also reached for cultured neuronal networks in Ref. |94 . 



On the contrary, studies like 42 hinted at a large overlap in cultures between the structural connec- 
tivity and the retrieved scale free functional connectivity, meeting other studies that identified power-law 
degree distributions in the functional connectivity of cultures 46 or of slices 73 . 

Whether connectivity in neural networks, in vitro or in vivo can be characterized as scale free or 
not is a highly debated issue (see e.g. 95 ), and, especially for self-organized networks of dissociated 
neurons, it is likely to depend on details of how the culture is grown. The connectivity retrieved from our 
calcium recordings might possibly be better described as "small- world" 96 , due to the existence of hub 
nodes with very high out- and in-degree. However, due to the large uncertainty in the reconstruction, 
we did not attempt any systematic assessment of the average path length between nodes (since this is 
a quantity very sensitive to error), and prefer to describe the reconstructed degree distribution just as 
"right-skewed" . 



How reliable are absolute values of reconstructed properties? 

Through our reconstructions of culture connectivity, we have provided actual absolute values for properties 
such as average degree, average clustering or average connection length. We note that such estimates 
are affected by a large uncertainty that goes beyond the variability described by their reconstructed 
distributions. Indeed, these estimates come from a reconstruction based on a specific choice of the number 
of links to include. This choice, as highlighted by Figure SI for synthetic data, corresponds to selecting 
directly a specific value of the average culture in- and out-degree. Therefore, as previously discussed, the 
included link number should be determined by our expectations on the average degree of the network to 
reconstruct, based on independent experimental evidence or on extrinsic guiding hypotheses. 

The inference of average clustering or of average distance is robust even against relatively large 
mistakes in the initial guess for the average degree (see e.g. Figure SI, for clustering estimation). In the 
case of the DIV 12 network reconstructed in Fig. [sj doubling the threshold of included links from 5% to 
10% (and therefore adopting a twice as large guess for the average culture degree) changes the inference 
for the average clustering coefficient from 0.090 ±0.010 to 0.157 ±0.011 and the average connection 
distance from 852 ± 463/im to 855 ± 464/im. Nevertheless, the precise values obtained do depend on the 
number of reconstructed links. 

Due to the lack of information, a better strategy might be, rather than focusing on absolute estimated 
values, to focus on comparisons with fully and partially randomized networks with analogous average 
degree or degree distribution, respectively. Such comparisons indeed can convey qualitative information 
about the occurrence of non-trivial deviations from chance expectations, which are likely to be more 
reliable than quantitative assessments. 
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Conclusions and perspectives 

In summary, we have developed a new generalization of Transfer Entropy for inferring connectivity in 
neuronal networks based on fluorescence calcium imaging data. Our new formalism goes beyond previous 
approaches by introducing two key ingredients, namely the inclusion of same bin interactions and the 
separation of dynamical states through conditioning of the fluorescence signal. We have thoroughly tested 
our formalism in a number of simulated neuronal architectures, and later applied it to extract topological 
features of real, cultured cortical neurons. 

We expect that, in the future, algorithmic approaches to network reconstruction, and in particular our 
own method, will play a pivotal role in unravelling not only topological features of neuronal circuits, but 
also in providing a better understanding of the circuitry underlying neuronal function. These theoretical 
and numerical tools may well work side by side with new state-of-the-art techniques (such as optogenetics 



or high-speed two-photon imaging 97 IQlj ) that will enable direct large-scale reconstructions of living 



neuronal networks. Our Transfer Entropy formalism is highly versatile and could be applied to the 
analysis of in vivo voltage-sensitive dye recordings with virtually no modifications. 

On a shorter time-scale, it would be important to extend our analysis to the reconstruction of both 
excitatory and inhibitory connectivity in in vitro cultures, which is technically feasible, and to compare 
diverse network characteristics, such as neuronal density or aggregation. Our algorithm could be used 
to systematically reconstruct the connectivity of cultures at different development stages in the quest for 
understanding the switch from local to global neuronal dynamics. Another crucial open issue is to design 
suitable experimental protocols allowing to confirm the existence of at least some of the inferred synaptic 
links, in order to validate statistically the reconstructed connectivity. For instance, the actual presence 
of directed links to which our algorithm assigns the largest TE scores might be systematically probed 
through targeted paired electrophysiological stimulation and recording. Furthermore, GFP transfection 
or inmunostaining might be used to obtain actual, precise anatomical data on network architecture to 
be compared with the reconstructed one. Finally, it might be interesting to reconstruct connectivity 
of cultured networks before and after physical disconnection of different areas of the culture (e.g. by 
mechanical etching of the substrate or by chemical silencing). These manipulations would provide a 
scenario to verify whether TE-based reconstructions correctly capture the absence of direct connections 
between areas of the neuronal network which are known to be artificially segregated. 



Materials and Methods 
Network construction and topologies 

We generated synthetic networks with N — 100 neurons, distributed randomly over a squared area 



of 0.5mm lateral size. We chose p = 0.12 as the connection probability between neurons 55 , leading 
to sparse connectivities similar to those observed in local cortical circuits f62l. We used non-periodic 
boundary conditions to reproduce eventual "edge" effects that arise from the anisotropic cell density at 
the boundaries of the culture. 

We considered two general types of networks: (i) a locally- clustered ensemble^ where the probability 
of connection depended on the spatial distance between two neurons; and (ii) a non-locally clustered 
ensemble, with the connections engineered to display a certain degree of clustering. 

For the case of a non-local clustering ensemble, we first created a sparse connectivity matrix, randomly 
generating links with a homogeneous probability of connection across pairs of neurons. We next selected 
a random pairs of links and "crossed" them (links A ^ B and C ^ D became A ^ D and C ^ B). 
We accepted only those changes that updated the clustering index in the direction of a desired target 
value, thereby maintaining the number of incoming as well as outgoing connections of each neuron. The 
crossing process was iterated until a clustering index higher or equal to the target value was reached. The 
overall procedure led to a full clustering index of the reference random network of 0.120 ± 0.004 (mean 
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and standard deviation, respectively, across 6 networks). After the rewiring iterations, we then achieved 
standard deviations from the desired target clustering value smaller than 0.1% for all higher clustering 
indices. 

We measured the full clustering index of our directed networks according to a common definition 
introduced by |68 : 

(A I AT\3 

nn — I V ^ ni \ /-i \ 

The binary adjacency matrix is denoted by A, with Aji = 1 for a link j ^ and zero otherwise. The 
adjacent matrix provides a complete description of the network topological properties. For instance, the 
in-degree of a node i can be computed as df^ = Aji^ and the out-degree as d^^^ = Aij. The total 
number of links of a node is given by the sum of its in-degree and its out-degree = df^ + The 
number of bidirectional links of a given node i (i.e. links between i and j so that i and j are reciprocally 
connected by directed connections) is given by d^^^^"^ = (A)|^. 

The adjacency matrix did not contain diagonal entries. Such entries would correspond to"autaptic" 
links that connect a neuron with itself. Note that our directed functional connectivity analysis is based 
on bivariate time series, and therefore it would be structurally unfit to detect this type of links. 

For the case of the local clustering ensemble, two neurons separated a Euclidean distance r were 
randomly connected with a distance dependent probability described by a Gaussian distribution, of the 
form po{r) = exp(— (r/A)^), with A a characteristic length scale. To guarantee that a constant average 
number of links C was present in the network, this Gaussian distribution was rescaled by a constant 
pre-factor, obtained as follows. We first generated a network based on the unsealed kernel po{r) and 
computed the resulting number of links C^ With this value we then generated a final network based on 
the rescaled kernel p{r) — C/C exp(— (r/A^)). 

Simulation of the dynamics of cultured networks 



The dynamics of the generated neuronal networks was studied using the NEST simulator 102 103 . We 



modeled the neurons as leaky integrate-and-fire neurons, with the membrane potential Vi{i) of a neuron i 



described by il04[|l05 



dt gi 

where gi = 50pS is the leak conductance and = 20ms is the membrane time-constant. The term 
/syn accounts for a time-dependent input current that arises from recurrent synaptic connections. In the 
absence of synaptic inputs, the membrane potential relaxes exponentially to a resting level set arbitrarily 
to zero. Stimulation in the form of inputs from other neurons increase the membrane potential, and 
above the threshold Vthr — 20mV an action potential is elicited {neuronal firing) . The membrane voltage 
is then reset to zero for a refractory period of tref = 2ms. 

The generated action potential excites post-synaptic target neurons. The total synaptic currents are 
then described by 

= - + «int E E ^^-^ ^^•^(^) - - + ^ext E - 4t,. - td). (3) 

J = l k I 

where A is the adjacency matrix, and = 2ms is a synaptic time constant. The resulting excitatory 
post-synaptic potentials (EPSPs) have a standard difference-of-exponentials time-course [106) . 

Neurons in culture show a rich spontaneous activity that originates from both fiuctuations in the 
membrane potential and small currents in the pre-synaptic terminals {minis). The latter is the most 
important source of noise and plays a pivotal role in the generation and maintenance of spontaneous 
activity fl07'. To introduce the spontaneous firing of neurons in Eq. ([3|, each neuron i was driven. 
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through a static couphng conductance with strength aext = 4.0pA, by independent Poisson spike trains 
(with a stationary firing rate of v^xt = 1.6Hz, spikes fired at stochastic times {^^t *})• 

Neurons were connected via synapses with short-term depression, due to the finite amount of synap- 
tic resources 105 . We considered only purely excitatory networks to mimic the experimental condi- 
tions in which inhibitory transmission is fully blocked. Concerning the recurrent input to neuron z, 
the set {tj} represents times of spikes emitted by a presynaptic neuron j, td is a conduction delay 
of td = 2ms, while aint sets a homogeneous scale for the synaptic weights of recurrent connections, whose 
time-dependent strength ai^tEji{t) depends on network firing history through the equations 

^ = -|f; + ^%E^(^-^.^)' (4) 

k 

(1 - Rji - Eji) . (5) 



dRji{t) _ 1 
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In these equations, Eji(t) represents the fraction of neurotransmitters in the "effective state", Rji{t) in 
the "recovered state" and Iji{t) = 1 — Rji — Eji in the "inactive state" 104, 105 . Once a pre-synaptic 



action potential is elicited, a fraction U = 0.3 of the neurotransmitters in the recovered state enters the 
effective state, which is proportional to the synaptic current. This fraction decays exponentially towards 
the inactive state with a time scale rinact = 3ms, from which it recovers with a time scale Tree = 500ms. 
Hence, repeated firing of the presynaptic cell in an interval shorter than Tree gradually reduces the 
amplitude of the evoked EPSPs as the synapse is experiencing fatigue effects (depression). 

Random networks of integrate- and- fire neurons coupled by depressing synapses are well-known to 
naturally generate synchronous events 105 , comparable to the all-or-none behavior that is observed in 
cultured neurons 46 47 . To obtain in our model a realistic bursting rate 47 , the synaptic weight of 
internal connections was set to result into a network bursting of O.lOiO.OlHz for all the network realiza- 
tions that we studied, and in particular for any considered (local or non-local) clustering level. Therefore, 
after having generated each network topology, we assigned the arbitrary initial value of aint = 5. Op A 
to internal synaptic weights and simulated 200 seconds of network dynamics, evaluating the resulting 
average bursting rate. If it was larger (smaller) than the target bursting rate, then the synaptic weight 
aint was reduced (increased) by 10%. We then iteratively adjusted aint by (linearly) extrapolating the 
last two simulation results towards the target bursting rate, until the result was closer than 0.01 Hz to 
the target value. The resulting used values of aint are provided in Table [l] 

Note that we defined a network burst to occur when more than 40% of the neurons in the network 
were active within a time window of 50ms. Such a criterion does not play any role in the reconstruction 
algorithm itself, where state selection is achieved through conditioning, but is only used for the automated 
generation of random networks with a prescribed bursting rate. Typically, for a fully developed burst, 
more than 90% of the neurons fire within a 50 ms bin, while, during inter-burst intervals, less than 
10% do. Due to the clear separation between these two regimes, our burst detection procedure does not 
depend significantly on the precise choice of threshold within a broad interval. 



Model of calcium fluorescence signals 

To reproduce the fluorescence signal measured experimentally, we treated the simulated spiking dynamics 
to generate surrogate calcium fluorescence signals. We used a common model introduced in 43 that gives 
rise to an initial fast increase of fluorescence after activation, followed by a slow decay (rca = Is). Such 
a model describes the intra-cellular concentration of calcium that is bound to the fluorescent probe. The 
concentration changes rapidly by a step amount of Aca = 50/iM for each action potential that the cell is 
eliciting in a time step t, of the form 

[Ca2+], - [Ca2+]t_i = - — [C8?+]t-i + Ac.nt, (6) 
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where rit is the total number of action potentials. 

The net fluorescence level F associated to the activity of a neuron i is finally obtained by further feeding 
the Calcium concentration into a saturating static non-linearity, and by adding a Gaussian distributed 
noise r]t with zero mean: 



■m- (7) 



For the simulations, we used a saturation concentration of = 300/iM and noise with a standard 
deviation of 0.03. 



Modeling of light scattering 

We considered the light scattered in a simulated region of interest (ROI) from surrounding cells. Denoting 
as dij the distance between two neurons i and j and by Xgc = 0.15 mm the scattering length scale 
(determined by the typical light deflection in the medium and the optical apparatus), the resulting 
fluorescence amplitude of a given neuron F^^ is given by 

N 

Fl'^ = F,^t ^ Asc Fj^texp[-{d,j/Xscf]. (8) 

A sketch illustrating the radius of influence of the light scattering phenomenon is given in Fig. S9. The 
scaling factor Asc sets the overall strength of the simulated scattering artifact. Note that light scattered, 
according to the equation shown above, could be completely corrected using a standard deconvolution 
algorithm, at least for very large flelds of view and a scattering length known with suflicient accuracy. 
In a real setup however, the relatively small flelds of view (on the order of 3 mm^), the inaccuracies 
in inferring the scattering radius A^c, as well as the inhomogeneities in the medium and on the optical 
system, make perfect deconvolution not possible. Therefore, artifacts due to light scattering cannot be 



completely eliminated 108,109 . The scaling factor A^c, that we arbitrarily assumed to be small and 



with value = 0.15, can be seen as a measure of this residual artifact component. 



Generalized Transfer Entropy 

In its original formulation [26j , for two discrete Markov processes X and Y (here shown for equal Markov 
order /c), the Transfer Entropy (TE) from Y to X was deflned as: 

TEy^x = ^P(a:„+i,x«,yW) log (9) 

^ P{Xn+l\x):') 

(k) 

where n is a discrete time index and Xn ^ is a vector of length k whose entries are the samples of X at 
the time steps n, n — 1, n — /c. The sum goes over all possible values of x^+i, xi!^^ and y^^ . 

TE can be seen as the distance in probability space (known as the Kullback-Leibler divergence [110| ) 
between the "single node" transition matrix P(xn+i and the "two nodes" transition matrix P(xn+i \x\ 
As expected from a distance measure, TE is zero if and only if the two transition matrices are identical, 
i.e. if transitions of X do not depend statistically on past values of F, and is greater than zero otherwise, 
signaling dependence of the transition dynamics of X on Y. 

We use TE to evaluate the directed functional connectivity between different network nodes. In 
a pre-processing step, we apply a basic discrete differentiation operator to calcium fluorescence time 
series F^^^, as a rather crude way to isolate potential spike events. Thus, given a network node x, we 
deflne Xn = PJ^n+i ~ ^l,n- This pre-processing step also improves the signal-to-noise ratio, thus allowing 
for a better sampling of probability distributions with a limited number of data points. To adapt TE 
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to our particular problem we need to take into account the general characteristics of the system. We 
therefore modified TE in two crucial aspects: 

1. We take into account that the synaptic time constants of the neuronal network (~1 ms) are much 
shorter than the acquisition times of the recording (~10 ms). We therefore need to account for 
"same bin" causal interactions between nodes, i.e. between events that fall in the same time-bin. 
Slower interactions with longer lags are still taken into account by evaluating TE for a Markov 
order larger than one (in time-bins units). 

2. We consider the possibility that the network dynamics switches between multiple dynamical states, 
i.e. between bursting and inter-bursting regimes. These regimes are characterized by different mean 
rates of activity and, potentially, by different transition matrices. Hence, we have to restrict the 
evaluation of TE to time ranges in which the network is consistently in a single dynamical state. 
The separation of dynamical states can be achieved by introducing a variable gt for the average 
signal of the whole network. 



N 
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We then include all data points at time instants in which this average fiuorescence gt is below a 
predefined threshold parameter ^, i.e. we consider only the time points that fulfill {t : gt < g}. 
We only make an exception that corresponds to the simulations of figures |3] and S2, where we 
considered time points that fall within an interval bounded by a higher and a lower thresholds, i.e. 
{t : ^low < gt < ^high}. 

Using these two novel aspects, we have extended the original description of Transfer Entropy [Eq. ([9|] 
to the following form 

^^Y^x{9) = 2_^P{xn+i,x)^\y\l^\gn+i <g) log -j^ ^ . (11) 

P(Xn+l|x^ ,^n+l < g) 

Probability distributions have to be evaluated as discrete histograms. Hence, the continuous range 
of fiuorescence values (see e.g. the bottom panels of Fig. [T]) is quantized into a finite number B of 
discrete levels. We typically used a small 5 = 3, a value that we justify based on the observation that 
the resulting bin width h is close to twice the standard deviation of the signal. The presence of large 
fiuctuations, most likely associated to spiking events, is then still captured by such a coarse, almost 
non-parametric description of fiuorescence levels. 



Network reconstruction 

Generalized TE values are obtained for every possible directed pair of network nodes, and using a fixed 
threshold level g. The set of TE scores are then ranked in ascending order and scaled to fall in the 
unit range. A threshold TE^^^ is then applied to the rescaled data, so that only those links with scores 
above TEthr are retained in the reconstructed network. 

A standard Receiver- Operator Characteristic (ROC) analysis is used to assess the quality of the 
reconstruction by evaluating the number of true positives (reconstructed links that are present in the 
actual network) or false positives (not present), and for different threshold values TEthr |111 • The 
highest threshold value leads to zero reconstructed links and therefore zero true positives and false 
positives. At the other extreme, the lowest threshold provides both 100% of true positives and false 
positives. Intermediate thresholds give rise to a smooth curve of true/false positives as a function of the 
threshold. The performance of the reconstruction is then measured as the degree of deviation of this 
curve from the diagonal, and that corresponds to a random choice of connections between neurons. 
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To provide a simple method to compare different reconstructions, we arbitrarily use the quan- 
tity TPio%, defined as the fraction of true positives for a 10% of false positives, as indicator for the 
quality of the reconstruction. 

An alternative to the ROC is the Positive Prediction Curve [48], plotting the "true-false ratio" (TFR) 
against the number of reconstructed links, called "true-false sum" (TFS). The TFR represents the fraction 
of true positives relative to the false positives. Denoting by #TP the absolute number of true positives 
and by #FP the number of false positives, TFR is therefore defined in 48 in the following way: 

The case TFR = corresponds to the case that, for any given reconstructed link, it is on average equally 
likely that it is in fact a true positive rather than a false positive. 

Alternative reconstruction methods 

To gain further insight into the quality of our reconstruction method, we compare reconstructions based 
on TE with three other reconstruction strategies, namely cross-correlation, mutual information, and 
Granger causality. 

Cross-correlation (XC) reconstructions are based on standard Pearson cross-correlation. The score 
assigned to each potential link is given by the largest cross-correlogram peak for lags between and 
tmax = 60ms, of the form 

XCy^x = ^^i^ax jcorr ^^-aI^^) Ids < g] • (13) 

In a similar way, the scores for Mutual Information (MI) reconstructions are evaluated as 

MIy^x = ^ max <yP{xn,yn-At\gn < g) log ^7 — . . . — ^ \ . 14 

^^=o--^— P{xn\gn < g) P{yn-At\gn < g) J 

Analogously to TE, the sum goes over all entries of the joint probability matrix. 

For the reconstruction based on Granger causality (GC) we first model the signal Xt by least- 
squares fitting of a univariate autoregressive model, obtaining the coefficients and the residual 77^, 

k 

Xt = ^a^Xt-i^Vt' (15) 

In a second step, we fit a second bivariate autoregressive model that includes the potential source signal yt, 
and determine the residual 77^, 

k k-1 

="^^1 ^t-l + X] Vt-m + Vl' (16) 
1=1 m=0 

Note that in the latter bivariate regression scheme we take into account "same bin" interactions as for 
Transfer Entropy (index of the second sum starts at m = 0). Given F^, the covariance matrix of the 



univariate fit in Eq. (15), and F^, the covariance matrix of the bivariate fit in Eq. (16), GC is then given 



by the logarithm of the ratio between their traces: 
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GC analyses were performed at an order k = 2. Analyses at /c = 1 yielded however fully analogous 
performance (not shown). 

We note that the same pre-processing used for TE is also adopted for all the other analyses. The 
same holds for conditioning on the value of the average fluorescence which can be applied simply by 
only including the subset of samples in which gt < 9- 



Hubs of (causal) connectivity 

Connectivity in reconstructed networks is often inhomogeneous, and groups of nodes with tighter internal 
connectivity are sometimes visually apparent (see e.g. reconstructed topologies in Figure [Sp). We do not 
attempt a systematic reconstruction of network communities 112 , but we limit ourselves to the detection 



of "causal sink" nodes 113 , which have a larger than average in-degree. We define this property in terms 
of the sum of TE from all other nodes to one particular node (^^. TE^^^), choosing the top 20 nodes for 
each particular network as selected "hub nodes" . 

We then analyze the dynamics of these selected hub nodes and of their neighbors. Specifically we 
define as C the subgraph spanned by a given hub node and by its first neighbors. We analyze then the 
cross-correlogram of the average fluorescence of a given group C with the average fluorescence of the 
whole culture: 

V'(r) = corr (A(F-(t + r)),,c, A(F-(t))i) . (18) 

The A-notation indicates that we correlate discretely differentiated average fluorescence time series, rather 
than the average time series themselves. Indeed, cross-correlograms for these differentiated time series 
are well modeled by a Gaussian functional form, due to the slow change of the averaged fluorescence 
compared to the sampling rate. 

Therefore, we fit a Gaussian to the cross-correlogram V^(r): 

V/,t(T) = ^cexp|-(^^^) |, (19) 

determining thus a cross-correlation amplitude Aq^ a cross-correlation peak lag rc and the standard 
deviation ac- 

The cross-correlation peak lag rc indicates therefore whether nodes in a given local hub neighborhood 
C fire on average earlier or later than other neurons in the network. 

Relative strength of synchrony within a local hub neighborhood C can be analogously evaluated by 



computing XCs, as defined in Eq. (13), for all the links within C and comparing it with peak XCs over 
the entire network. 



Experimental preparation 

Primary cultures of cortical neurons were prepared following standard procedures p^|114| . Cortices were 
dissected from Sprague-Dawley embryonic rat brains at 19 days of development, and neurons dissociated 
by mechanical trituration. Neurons were plated onto 13 mm glass cover slips (Marienfeld, Germany) pre- 
viously coated overnight with 0.01% Poly-l-lysine (Sigma) to facilitate cell adhesion. Neuronal cultures 
were incubated at 37 °C, 95% humidity, and 5% CO2 for 5 days in plating medium, consisting of 90% 
Eagle's MEM — supplemented with 0.6% glucose, 1% lOOX glutamax (Gibco), and 20 /ig/ml gentamicin 
(Sigma) — with 5% heat-inactivated horse serum (Invitrogen) , 5% heat-inactivated fetal calf serum (Invit- 
rogen), and 1 /ig/ml B27 (Invitrogen). The medium was next switched to changing medium, consisting of 
of 90% supplemented MEM, 9.5% heat-inactivated horse serum, and 0.5% FUDR (5-fluoro-deoxy-uridine) 
for 3 days to limit glia growth, and thereafter to final medium, consisting of 90% supplemented MEM 
and 10% heat-inactivated horse serum. The final medium was refreshed every 3 days by replacing the 
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entire culture well volume. Typical neuronal densities (measured at the end of the experiments) ranged 
between 500 and 700 neurons/mm^. Cultures prepared in these conditions develop connections within 
24 hours and show spontaneous activity by day in vitro (DIV) 3-4 |14,J^, 20 . GAB A switch, the change 



14 20 



of GABAergic response from excitatory to inhibitory, occurs at DIV 6-7 

Neuronal activity was studied at day in vitro (DIV) 9-12. Prior to imaging, cultures were incubated 
for 60 min in pH-stable recording medium in the presence of 0.4% of the cell-permeant calcium sensitive 
dye Fluo-4-AM (Invitrogen) . Recording solution includes (in mM): 128 NaCl, 4 KCl, 1 CaCl2, 1 MgCls, 
10 glucose, and 10 HEPES; pH titrated to 7.4 and osmolarity to 320 mOsm with 45 mM sucrose. The 
culture was washed off Fluo-4 after incubation and finally placed in a chamber filled with fresh recording 
medium. The chamber was mounted on a Zeiss inverted microscope equipped with a 5X objective and a 
0.4X optical zoom. 

Neuronal activity was monitored through high-speed fiuorescence imaging using a Hamamatsu Orca 
Flash 2.8 CMOS camera attached to the microscope. Images were acquired at a speed of 100 frames/s 
(i.e. 10 ms between two consecutive frames), which were later converted to a 20 ms resolution using a 
sliding window average to match the typical temporal resolution of such recordings 13 , 43 ]45|[9H|115| . 
The recorded images had a size of 648 x 312 pixels with 256 grey-scale levels, and a final spatial resolution 
of 3.4 /im/pixel. This settings provided a final field of view of 2.2 x 1.1 mm^ that contained on the order 
of 2000 neurons. Activity was finally recorded as a long image sequence of 60 minutes in duration. We 
verified that the fiuorescence signal remained stable during the recording, as shown in Supplementary 
Figure S8A, and we did not observe neither photo-bleaching of the calcium probe nor photo-damage of 
the neurons. 

Before the beginning of the experiment, inhibitory synapses were fully blocked with 40 /jM bicu- 
culline, a GABA^ antagonist, so that activity was solely driven by excitatory neurons. Since cultures 
were studied after GABA switch, the blockade of inhibition resulted in an increase of the fiuorescence 
amplitude, which facilitated the detection of neuronal firing, as illustrated in Fig. S8B. 

The image sequence was analyzed at the end of the experiment to identify all active neurons, which 
were marked as regions of interest (ROIs) on the images. The average grey-level on each ROI along the 
complete sequence finally provided, for each neuron, the fiuorescence intensity as a function of time. Each 
sequence typically contained on the order of a hundred bursts. Examples of recorded fiuorescence signal 
for individual neurons are shown in Fig. [l^. 



Analysis of experimental recordings 

The fiuorescence data obtained from recordings of neuronal cultures was analyzed following exactly the 
same procedures used for simulated data (e.g. processed in a pipeline including discrete differentiation, 
TE or other metrics evaluation, ranking, and final thresholding to maintain the top 10% of connections). 

Due to the lack of knowledge of the ground-truth topology, optimal conditioning level cannot be 
known. However, based on the similarity between experimental and simulated distributions of calcium 
fiuorescence, we select a conditioning level such to exclude the high fiuorescence transients associated 
to fully-developed bursting transients while keeping as many data points as possible. Concretely, this is 
achieved by taking a conditioning level equal to approximately two standard deviations above the mean of 
a Gaussian fit to the left peak of the fiuorescence histogram. Such a level coincides with the point where, 
when gradually increasing the conditioning level, the reconstructed clustering index reaches a plateau, 
i.e. matches indicatively the upper limit of range II in Figure |3] 

To check for robustness of our reconstruction, we generated alternative reconstructions based on 
different conditioning levels. For the selected conditioning value, and for both the experimental datasets 
analyzed (Figs, [s] and S7), we verified that the inferred topological features, including notably the average 
clustering coefficient and connection distance, were stable in a range centered on the selected conditioning 
value and wide as much as approximately two standard deviations of the fluorescence distribution. 



30 



To identify statistically significant non-random features of the real cultured networks in exam, we 
compared the reconstructed topology to two randomizations. 

A first one consisted in a complete randomization that preserved only the total number of connections 
in the network, but scrambled completely the source and target nodes. The resulting random ensemble 
of graphs was an Erdos-Renyi ensemble (see, e.g. f65^) in which each possible link exists with a uniform 
probability of connection p = C/{N{N — 1)), where C is the total number of connections in the reference 
reconstructed network. 

A second partial randomization preserved the in-degree distributions only, and was implemented by 
shuffling the entries of each row of the reconstructed adjacency matrix, internally row- by-row. In this way, 
the out-degrees of each node were preserved. In both randomization processes, we disallowed diagonal 
entries. 

For both randomizations we calculated the in-degree, the distance of connections and the full clustering 
index for each node, leading to distributions of network topology features that could be compared between 
the reconstructed network and the randomized ensembles, to identify significant deviations from random 
expectancy. 
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Figure Legends 

Figure 1. Network activity in simulation and experiments. A Bright field image (left panel) of 
a region of a neuronal culture at day in vitro 12, togther with its corresponding fluorescence image (right 
panel), integrated over 200 frames. Round objects are cell bodies of neurons. B Examples of real (left) and 
simulated (right) calcium fluorescence time series, vertically shifted for clarity. C Corresponding averages 
over the whole population of neurons. D Distribution of population averaged fluorescence amplitude for 
the complete time series, for a real network (left) and a simulated one (right). 



Figure 2. Independence of network dynamics from clustering coefficient. A Examples of 
spike raster plots for three networks with different clustering coefficients (non-local clustering ensemble), 
showing that their underlying dynamics are similar. B Histograms of the inter-burst intervals (IBIs), with 
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the vertical lines indicating the mean of each distribution. The insets illustrate the amount of clustering 
by showing the connectivity of simple networks that have the same clustering coefficients as the simulated 
ones. 

Figure 3. Dependence of the directed functional connectivity on the dynamical state. A The 

distribution of averaged fluorescence amplitudes is divided into seven fluorescence amplitude ranges. The 
functional connectivity associated to different dynamical regimes is then assessed by focusing the analysis 
on specific amplitude ranges. B Quality of reconstruction as a function of the average fiuorescence 
amplitude of each range. The blue line corresponds to an analysis carried out using the entire data 
sampled within each interval, while the red line corresponds to an identical number of data points per 
interval. C Visual representation of the reconstructed network topology (top 10% of the links only), 
together with the corresponding ROC curves, for the seven dynamical regimes studied. Edges marked in 
green are present in both the reconstructed and the real topology, while edges marked in red do not match 
any actual structural link. Reconstructions are based on an equal number of data points in each interval, 
therefore refiecting the equal sample size performance (red curve) in panel B. Interval I corresponds 
to a noise-dominated regime; intervals II to IV correspond to inter-burst intervals with intermediate 
firing rate and provide the best reconstruction; and intervals V-VII correspond to network bursts with 
highly synchronized neuronal activity. Simulations were carried out on a network with local topology 
(A = 0.25 mm) and light scattering in the fluorescence dynamics. The results were averaged over 6 
network realizations, with the error bars in B and the shaded regions in C indicating a 95% confidence 
interval. 

Figure 4. TE-based network reconstruction of non-locally clustered topologies. A ROC curve 
for a network reconstruction with generalized TE of Markov order A: = 2, and with fluorescence data con- 
ditioned at g < g = 0.112. The shaded area depicts the 95% confldence interval, and is based on 6 
networks. B Comparison between structural (shown in blue) and reconstructed (red) network properties: 
clustering coefficients (top), degree distribution (center), and distance of connections (bottom). C Recon- 
structed clustering coefficients as a function of the structural ones for different reconstruction methods. 
Non-linear causality measures, namely Mutual Information (MI, red) and generalized Transfer Entropy 
(TE, yellow), provide the best agreement, while a linear reconstruction method such as cross-correlation 
(XC, blue) fails, leading invariantly to an overestimated level of clustering. The error bars indicate 95% 
confidence intervals based on 3 networks for each considered clustering level. All network realizations 
were constructed with a clustering index of 0.5, and simulated including light scattering artifacts in the 
fluorescence signal. 

Figure 5. TE-based network reconstruction of locally-clustered topologies. A ROC curve for 
a network reconstruction with generalized TE Markov order A: = 2, with fluorescence data conditioned at 
g < g = 0.084. The shaded area depicts the 95% confidence interval based on 6 networks. B Comparison 
between structural (top) and reconstructed (bottom) connectivity. For the reconstructed network (after 
thresholding to retain the top 10% of links only) true positives are indicated in green, and false positives 
in red. C Comparison between structural (blue) and reconstructed (red) network properties: clustering 
coefficients (top), degree distribution (center) and distance of connections (bottom). D Reconstructed 
length scales as a function of the structural ones for different reconstruction methods. The non- linear 
causality measures Mutual Information (MI, red) and generalized Transfer Entropy (TE, yellow) provide 
good reconstructions, while the linear cross-correlation (XC, blue) provides invariantly an underestimated 
length scale. The error bars indicate 95% confidence intervals based on 3 networks per each considered 
length scale. All network realizations were constructed with a characteristic length scale A = 0.25 mm. 
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and simulations included light scattering artifacts. 

Figure 6. Dependence of performance level on network clustering, conditioning level and 
light scattering artifacts. The color panels show the overall reconstruction performance level, quan- 
tified by TPio% (black, 0% true positives; white, 100% true positives), for different target ground-truth 
clustering coefficients and as a function of the used conditioning level. Five different reconstruction 
algorithms are compared: cross-correlation (XC), Granger Causality (GC) with order A: = 2, Mutual 
Information (MI), and Transfer Entropy (TE) with Markov orders k = 1,2. The top row corresponds 
to simulations without artifacts, and the bottom row to simulations including light scattering. Recon- 
structions with linear methods are performant only in the absence of light scattering artifacts. TE 
reconstruction with k = 2 shows the best overall reconstruction performance, even with light scatter- 
ing and for any target clustering coefficient. An optimal reconstruction is obtained in a narrow range 
surrounding the conditioning value of ^ 0.2. 

Figure 7. Dependence of reconstruction quality on TE formulations and recording length. 

A ROC curves for network reconstructions of non- locally clustered (left panel) and locally-clustered 
topologies (right), based on three TE formulations: conventional TE (blue), generalized TE with same 
bin interactions only (red) or including also optimal conditioning (yellow). The vertical lines indicate the 
performance level at TPio%, and provide a visual guide to compare the quality of reconstruction between 
different formulations. B Decay of the reconstruction quality as measured by TPio% for the two topology 
ensembles and for generalized TE with conditioning, as a function of the data sampling fraction s. A full 
simulated recording of Ih in duration provides a data set of length S'lh, corresponding to a data sampling 
fraction of s = 1. Shorter recording lengths are obtained by shortening the full length time-series at a 
shorter length given by = 6'ih/s, with 5 = 1, 19. The insets show the same results but plotted as a 
function of S' in semi- logarithmic scale. For both A and B, the panels in the left column correspond to 
the non-locally clustered ensembles (cfr. Fig. [4|, while the panels in the right column correspond to the 
locally-clustered ensemble (cfr. Fig. [5|. Shaded regions and error bars indicate 95% confidence intervals 
based on 6 networks. 

Figure 8. Network reconstruction of an in vitro neuronal culture at DIV 12. A Example 
of TE reconstructed connectivity in a subset of 49 neurons (identified by black dots) in a culture with 
N = 1720 marked neurons (regions of interest) in the field of view, studied at day in vitro 12. Only the 
top 5% of connections are retained in order to achieve, in the final reconstructed network, an average 
connection degree of 100 (see Results). B Properties of the network inferred from TE reconstruction 
method (top panels) compared to a cross-correlation (XC) analysis (bottom panels). The figure shows 
reconstructed distributions for the in-degree (left column), the connection distance (middle column), and 
the clustering coefficient (right column). In addition to the actual reconstructed histograms (yellow), 
distributions for randomized networks are also shown. Blue color refers to complete randomizations that 
preserves the total number of connections only, and red color to partial randomizations that shuffle only 
the target connections of each neuron in the reconstructed network. 

Supporting Figure Legends 

Supplementary Figure 1: Dependence of reconstructed degree and clustering coefficient on 
the fraction of included links. We report here analyses for three representative non-locally 
clustered networks with clustering coefficients 0.1, 0.3 and 0.5 (as in Fig. [2]). A The average 
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degree, by construction, varies linearly with the fraction of included links (or, shortly, threshold). B 
The reconstructed clustering coefficient show a steep rise at low threshold values, and describe broad 
hill-like profiles between approximately 5% and 10% of reconstructed links, which peaks around CC 
values matching approximately the ground truth clustering value of the network. For higher values of 
the threshold, the dependency of the reconstructed clustering coefficient against threshold becomes closer 
to the linear. Note that by definition of the "full" clustering coefficient, here measured (see Methods 
section), a random network would show a perfect linear correlation. 

Supplementary Figure 2: Hubs of functional connectivity. A Plot of the spatial position of neu- 
rons of an example simulated network, highlighting functional connectivity hubs (red dots) for each of the 
dynamic intervals I- VII depicted in Fig.[3j B For the corresponding dynamical regimes, the average cross- 
correlation between calcium signals of first-neighbor nodes of functional connectivity hubs (blue bars, see 
Materials and Methods) is compared to the average cross-correlation between nodes inside the latter local 
groups and the rest of the population (green). Significance analysis is carried out using a Mann- Whitney 
test (n.s.: not significant). The results show that the degree of synchronization within the neighborhood 
of functional connectivity hubs is significantly higher than across groups. C Crosscorrelogram time-shift 
of the average activity of the same local groups with respect to the whole network average (see Methods, 
cfr. also Fig. S3). Negative shifts denote "bursting earlier" and positive shifts "bursting later". The 
analysis shows that the calcium fluorescence signal in the neighborhood of functional connectivity hubs 
for the dynamic ranges II-IV is "leading", i.e. has a negative lag in relation to the average population. 
Such functional connectivity hubs correspond therefore to foci of burst initiation. 

Supplementary Figure 3: State dependency in simulated and real data. Two experiments 
are considered, "A" being the one at DIV 12 (Fig. [8| and "B" corresponding to the one at DIV 9 
(Fig. S7), together with a simulated data set, corresponding to the network from the local clustering 
ensemble (studied in Fig. [5|. For the three cases we separate four ranges of fluorescence signal and 
identify accordingly four dynamical regimes (encoded by different colors, top row). We compute then, 
in each one of these regimes, the clustering index (central row) and the average connection distance 
(bottom row). While the exact reconstructed values of the clustering index and the average connection 
distance are of course different, all three datasets share the main qualitative features, like the fact that 
the reconstructed clustering index is peaked in range "III" (marked in yellow in the top panels) where 
their average connection distance is lowest. 

Supplementary Figure 4: Analysis of additional topological features. A In non-locally clustered 
topologies, the non-linear causality measures Mutual Information (MI, red) and Transfer Entropy (TE, 
yellow) correctly estimate the average connection distance, while a linear measure such as cross-correlation 
(XC, blue) fail to do so, invariantly under-estimating this distance, as for the locally clustered ensemble 
(cfr. Fig. (Sp). Note that non-locally clustered topologies are random by construction in terms of the 
spatial distribution of connections, and, therefore, they display a virtually identical average connection 
length, independent of the clustering coefficient. B In locally-clustered topologies, the non-linear measures 
MI (red) and TE (yellow) also provide a good estimation of the resulting clustering coefficient, in contrast 
with cross-correlation, XC (blue), invariantly overestimating the clustering level, as for the non-locally 
clustered ensemble (cfr. Fig. [4|l)). In both plots, the dashed line corresponds to perfect reconstruction. 

Supplementary Figure 5: Dependence of performance on characteristic length scale, con- 
ditioning level and light scattering. The color panels show the overall reconstruction performance 
level, quantified by TPio% (black, 0% true positives; white, 100% true positives), for different target 
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ground-truth clustering coefficients and as a function of the used conditioning level. Five different recon- 
struction algorithms are compared: cross-correlation (XC), Granger Causality (GC) with order /c = 2, 
Mutual Information (MI), and Transfer Entropy (TE) with Markov orders k = 1,2. The top row cor- 
responds to simulations without artifacts, and the bottom row to simulations including light scattering. 
TE of order k = 2 and with light scattering provides a fair reconstruction quality at any length scale for 
a conditioning value g 0.1. (Note that the scale bar of reconstruction performance is different from the 
one in Fig. |6]) 

Supplementary Figure 6: Positive precision curve (PPC) analysis. This figure provide an 
alternative description of the reconstruction performances as a function of the inclusion of different 
components of the algorithm, studied in Fig. [TJA.. A definition of positive prevision curves is provided 
provided in Materials and Methods, following Ref. [48 . The plots show the "true-false ratio" (TFR) as 
a function of the "true-false sum" (TFS), i.e. the number of reconstructed links, and describe indirectly 
the likelihood that an included link is actual true positive. Positive values correspond to a larger number 
of true than false positives among the reconstructed links. Similarly to Fig. [7|^, only generalized TE 
including both same-bin interactions and optimal conditioning robustly displays positive values of TFR 
in a broad range of TFS. Note that a threshold of 10% top links (used for example at the bottom panel 
of Fig. [5^), provides a TFS value of 990. Shaded areas are 95% confidence intervals across 6 networks. 

Supplementary Figure 7: Network reconstruction of an in vitro neuronal culture at DIV 9. 

A Detail of example fluorescence time series of individual neurons (top panel) and population average 
of fluorescence (bottom panel). Both synchronous bursts and inter-burst modulations of the fluorescence 
baseline are easily detectable in these recordings. B Example of TE reconstructed connectivity in a subset 
of 39 neurons (identified by black dots) in a culture with N = 1668 marked neurons (regions of interest) 
in the field of view, studied at day in vitro 9. Only the top 5% of connections are shown. C Properties of 
the network inferred from TE reconstruction method (top panels) compared to a cross-correlation (XC) 
analysis (bottom panels). The figure shows reconstructed distributions for the in-degree (left column), 
the connection distance (middle column), and the clustering coefficient (right column). In addition to 
the actual reconstructed histograms (yellow), distributions for randomized networks are also shown. Blue 
color refers to complete randomizations that preserves the total number of connections only, and red color 
to partial randomizations that shuffle only the target connections of each neuron in the reconstructed 
network. 



Supplementary Figure 8: Stability, development and axon length of neuronal cultures. 

A Typical calcium fluorescence recording of neuronal activity at DIV 12 along 1 hour, showing the 
stability of the fluorescence signal (blue) and the rich network activity. B Comparison between the 
fluorescence signal before (bottom) and after (top) pharmacological blocking of GABAergic transmission 
at DIV 9. The increase in burst amplitude after blocking of GABA confirms that GABA switch has 
already occurred. C Direct visualization of GFP-transfected neurons at DIV 12. The highlighted axons 
are between 1 and 2 mm in length and reveal the existence of long range connections in early mature 
cultures. 

Supplementary Figure 9: Schematic representation of light scattering radius. Light scattering 
artifacts are expected to affect fluorescence imaging data when neurons are sufficiently close to each 
other. In this example, neurons are indicated as black dots. For three particular neurons, the radius of 
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the simulated light scattering artifact Age = 0.15 mm is shown as green circular area centered on each of 
these neurons. 

Tables 

Table 1. Synaptic weights used in the simulation. Mean and standard deviation for the internal 
synaptic weights aint, used in the simulation of 6 networks with a non-locally clustered ensemble (listed 
with ascending clustering coefficients CC, and 6 networks with a locally- clustered ensemble (listed by 
ascending length scales A). 
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0.230 




1.25 


6.505 


0.158 




1.5 


6.519 


0.113 



ures and Supporting figures 





experiment 



CD 65 




20 40 60 

time (s) 




1000 




occurrences 

Ol o o 
o o o 




o 




^ 10 




5 





51.0 51.5 52.0 52.5 
fluorescence (a.u.) 



0.0 



0.1 0.2 0.3 0.4 
fluorescence (a.u.) 



0.5 



Figure 1 



43 



CC = 0.1 




10 20 30 40 50 60 70 



inter-burst interval (s) 



CC = 0.3 




10 20 30 40 50 60 70 



inter-burst interval (s) 
Figure 2 



CC = 0.5 




10 20 30 40 50 60 70 



inter-burst interval (s) 



44 




Figure 3 




Figure 4 




Figure 5 



47 




Figure 6 




Figure 7 



49 




■ full randomiz 

■ partial randomiz. 

■ reconstruction 





I 






1 











100 200 

degree 



300 



400 




500 1000 1500 2000 2500 

distance (unn) 



0.05 0.10 



0.15 

cc 



0.20 0.25 




■true CC=0.1 

■ true CC=0.3 

■ true CC=0.5 



0.2 0.4 0.6 0.8 
fraction of links 



0.2 0.4 0.6 0.8 
fraction of links 



1.0 



Figure 9. [FIGURE SI] 



50 



II. 


III. 


IV. 


• 

° . * • • .* 
• • • 

..* 

** • . • * • 
• • • . 
• 




• • 

• • 

* • * . • 

* * • • 



VI. 


VII. 


••• 

• * * ' ***. • 
• * * • . • * 


• 

, • .* 

• 

• 



cell positions 
• hubs 



B 10 

0.8 

I 0-6 
o 

0.4 
o 

0.2 
0.0 

C 

3 

1 ' 

^ 1 

I -1 
-2 



n.s. 



*** **** 



JILLL 







ill 









IV. V VI. VII. 
range 



Figure 10. [FIGURE S2] 



51 



3 

CD 

o 
c 

<D 
O 
(/) 
CD 

o 
3 



2.0 



1.0 



0.0 



experiment A 



15 30 45 
time (s) 




60 




6.0 
3.0 
0.0 



experiment B 



15 30 45 
time (s) 




60 




simulation 



0.8 
0.4 
0.0 







— II. 


J 


u 


— III. 

IV. 



15 30 45 
time (s) 



60 




range 



range 



range 



Figure 11. [FIGURE S3] 




B 



0.524 0.526 0.528 0.530 0.532 0.534 

Structural distance (mm) 



en 
_3 
o 

■D 


o 0.4 

CO 

c 
o 

8 0.2 



0.0 



xc 


Ml 




H '- 


TE 










T^^^ ^^^^^ 





0.1 0.2 0.3 0.4 

Structural clustering 



Figure 12. [FIGURE S4] 



xc 



GC 



Ml 



TE (k=1] 




0.2 0.4 0.6 
cond. level 



0.2 0.4 0.6 0.8 
cond. level 



0.0 



0.2 0.4 0.6 0.8 

cond. level 



I 



0.2 0.4 0.6 0.8 
cond. level 



0.4 



0.8 



Figure 13. [FIGURE S5] 



clustered topology 



local topology 




2000 

TFS 



Figure 14. [FIGURE S6] 



53 




Figure 15. [FIGURE S7] 



